Experience and Skills
SUMMARY OF SPECIFIC EXPERIENCE AND SKILLS
Barbara Ann Kipfer, Ph.D.
I am a lexicographer / linguist and ontologist. I am the Chief Lexicographer of Temnos. I was previously the Chief Lexicographer of Dictionary.com / Thesaurus.com / Reference.com. I also worked for Google, WolframAlpha, IBM Research, and more (CV at referencewordsmith.com).
I have written more than 70 books and am the editor of Roget’s International Thesaurus and the million+-selling Roget’s 21st Century Thesaurus. I wrote The Flip Dictionary (a huge reverse dictionary), The Order of Things (1000+ hierarchies in the world), and Trivia Lovers’ Lists of Nearly Everything in the Universe (1000+ lists). My book 14,000 Things to Be Happy About has sold 1.25 million+ copies, and the 25th anniversary edition was published in 2015 (www.thingstobehappyabout.com).
I compiled ALL of Thesaurus.com and added 7,000 new-word entries to Dictionary.com. I mapped 65% of WordNet’s noun synsets to Freebase (=Wikipedia) for Google’s Knowledge Graph. I also contributed 2,800 lists to Wikilists (list.wikia.com/wiki/Main_Page). For Temnos, I have added 11,000 new word entries to its lexicon.
I am not a programmer but, rather, the “human hand” working in such areas as metadata curation, linguistics research, content creation and compilation, improvements to search, natural language processing, semantic analysis and processing, and artificial intelligence. I work very well with engineers and designers and, through this type of collaboration, I make large contributions to projects.
I am a person who needs a mission, moves fast, needs change, and wants my work to have a positive impact on the project and be useful to many people. Everyone I work with is amazed at my output, the pace at which it is completed, and its accuracy and thoroughness.
My expertise includes:
• knowledge curation and managing digital research metadata to facilitate access, dissemination, and preservation of information content and context
• development of back-end dictionaries and thesauri that aid disambiguation and search
• mapping reference works to each other, e.g. dictionary-to-encyclopedia
• expert human curation and prioritization of sources of categorical data and curation of machine-generated categorical data; analysis and development of categorization schemes, hierarchies, and ontologies
• expert human curation of external categorical systems for the purpose of integrating them with internal categorical systems; linking of these categories to category equivalent nodes; mapping taxonomies / ontologies to each other
• assignment of category relationships: compatible, incompatible, superset, subset, parent, child, broader, narrower, other relations
• expert at human-machine collaboration and at human evaluation, rating, editing, and correction of an algorithm’s work for accuracy and quality
• expert at assembling online databases of knowledge and editing and verifying them
• expert in lexicography (dictionaries, thesauri) and encyclopedia work; creation of back-end lexical and thesaural databases that make further connections within search and metadata systems
• expert in content creation and question-answering
• expert in reference publishing and reference software
• experience aiding artificial intelligence, natural language processing, and Internet search projects
Details of Experience
For IBM Research, I was a research lexicologist for the Language and Knowledge Systems group headed by Lance Miller and Roy Byrd. The group worked on text-critiquing systems and other natural language analysis based, in part, on the study of dictionaries. I performed many analyses of online dictionaries and contributed to the building of the group’s “ultimate” dictionary.
For Bellcore, I spent three years as the non-resident lexicographer for Don Walker and Robert Amsler in the Artificial Intelligence and Information Science Research Group. I brought to Bellcore my background in lexicography; I really know reference books inside and out. I am an indefatigable worker who glories in data and provided Bellcore with many detailed analyses of lexical materials. For example, I went through all the tables in the machine-readable copy of the World Almanac to explicitly connect the proper nouns with attributes and values so that the names of bridges, mountains, rivers, Nobel laureates, etc. were properly associated with that information. This provided comprehensive IS-A classifications, as well as the beginnings of a set of lexical entries that would contain the properties of biographical and geographical proper nouns. I also undertook a meticulous hand-correction of the genus terms in the definitions of entries in the McGraw-Hill Dictionary of Scientific and Technical Terms so that Bellcore could create a thesaurus based on a taxonomic analysis of those terms.
General Electric Research
For Paul Jacobs in the Artificial Intelligence Program of General Electric Corporate Research, I was a non-resident lexicographer and did the bulk of the work in developing a proprietary lexicon of over 10,000 word roots. The main purpose of the lexicon was to provide a core of domain-independent lexical knowledge for disambiguating word senses in semantic interpretation. The lexicon differed from most other systems in tying morphological and syntactic information to senses as well as to full-word entries, and in associating word sense information with a conceptual hierarchy of about 1,000 concepts. The lexicon was tested and used in database generation and information retrieval applications.
For Bill Gross, I wrote content for many reference software products, including My First Encyclopedia (wrote 2000+ encyclopedia articles for children, including amazing facts and bibliographic information), the adventure series, and Smart Games. For idealab companies such as GoTo, Overture, CitySearch, and Perfect Market, I developed categorization schemes and hierarchies and filled them with vocabulary, including inflections, synonyms, and related words.
At TextWise for Elizabeth Liddy, I worked as an analyst on the expansion of lists of complex nominals and on the development of an ontology. This experience included enhancing and expanding a thesaurus of single words as well as finding synonyms for complex nominals (noun-noun combinations and collocations). I prepared a schema of coverage for the concepts and a list of relations. I also worked on a revision of TextWise’s subject-field classification scheme.
For Associative Computing, Mindmaker, and Textdigger, I was head lexicographer and completed numerous lexicographic projects crucial to the companies’ software development. This included creating and editing thousands of lexical entries containing various types of grammatical and semantic information. Starting with WordNet and Mitton’s list, I added entries and equivalents/synonyms, added obscurity tags at the sense and equivalent level, added proper nouns and classified them by subject, prepared a comprehensive list of collocations and their equivalents, and added derivatives and inflections to the lexicon.
For the Bill Gross iterations (the original Answers.com), I was the person who answered every question for the first 18 months; in the second iteration, I was part of an effort to develop a system of cybrarians for answering questions. I also spent a very short stint at Answers.com (GuruNet) developing content.
When I was hired by Ask Jeeves, I was the person who answered every emailed reference question for a year, and I then answered questions in the Ask Jeeves “community.” I also analyzed the template questions-and-answers created by Ask Jeeves editors, developed the basic plan and categorization scheme of the question-and-answer “community,” and researched best answers to thousands of questions. Though what went on at Ask Jeeves cannot be considered natural language processing, I did compile and edit the dictionaries used in the backend of the system.
At Cymfony for Wei Li, I was responsible for analyzing the lexical resources for use in a system for information extraction, text mining, and question-answering. I selected a tool (DIMAP by CL Research) and prepared a design into which the company would eventually move its lexical resources and build from there. Other tasks included analysis of the use of WordNet and a more complete verb subcategorization scheme, collection of multiword lexemes to add to the master lexicon, development of a verb hierarchy and assignment of 7,000 verbs to the scheme, and building a 20,000-term business and finance lexicon.
My work for Dictionary.com, Thesaurus.com, and Reference.com was more traditional lexicography – basically creating original content. I also helped guide the choice of public domain and licensed content.
Created the entire thesaurus for Thesaurus.com.
Created reverse dictionary of 70,000+ entries.
Wrote 366 word-question-of-the-day features.
Wrote On This Day (today in history) feature.
Wrote crossword solver of 165,000 entries.
Wrote dictionary of new words (21st Century Lexicon) with 7,000+ entries.
Wrote 26 holiday and season word origin features and 31 “word traveler” features.
Wrote database of 1,400 differences between words.
Wrote style guide.
For Stephen Wolfram, I did lexicographic analysis and special projects involving word data and linguistic data curation. I sought etymological information that was missing in word data so that words could be added to the First Known Use in English feature. Other projects included creating a feature list for word data: frequency of use, rhymes, lexically close words, phrases, other notable uses, crossword puzzle clues, Scrabble score, phone keypad digits. I suggested adding: number of letters in word (length of word in letters), number of syllables, count/non-count if noun (e.g., book is count, love is non-count), ASCII representation of word, Top Ten lists.
Tim Musgrove and Peter Ridge at Federated Media Publishing hired me as Chief Lexicographer to add content to the lexicon underpinning the natural language processing system. The system did text/data mining, deriving information from textual content such as web pages, news articles, and enterprise documents using a complicated lexical system. I created new word and phrase entries for a lexicon that worked in the backend of the system, which read and parsed blog content. The end product was matching up the vocabulary in blogs to advertising.
For Jamie Taylor and John Giannandrea at Google, I did expert curation and prioritization of categorical data and categorical systems into hierarchies to facilitate organizing data in the Knowledge Graph. Within the Knowledge Graph, I performed a number of ontology and taxonomy mapping projects, including assigning category relationships (equivalent, compatible/incompatible, superset/subset, parent/child, broader/narrower, etc.). My major project was mapping WordNet to Freebase (=Wikipedia); I completed 50,000 mappings, covering 63% of the nouns. I also modeled the future dictionary entry format for Rich Snippets. A second short consultancy as a linguist for Dave Orr (Common Sense group in Machine Intelligence) involved working with multiple sense inventories, including their characteristics and issues in merging and creating alignments, as well as how to make use of corpora annotated with them.
At Temnos, I work for Tim Musgrove and Peter Ridge as Chief Lexicographer, doing lexicographic and ontological analysis and development for semantic technologies. I have compiled 11,000+ new word entries for the lexicon. I make improvements via theme files, tag files, topic maps, metatopic maps, lexicon edits, etc. I evaluate beta projects before they go live.
Reference Wordsmith entails freelance data curation, semantic research, natural language processing, search engine and information retrieval work, question-answering, and artificial intelligence research as well as developing and mapping ontologies, hierarchies, taxonomies, and classification schemes. Some clients: Ameritech Publishing, BellSouth Publishing, Columbia University Press, Dorling Kindersley, eHow, Fitzhenry & Whiteside, Funk & Wagnalls, Future Vision Multimedia, Grolier, Happier, Happify, HubPages, Laurence Urdang, Longman Dictionaries, Macmillan Reference, Merriam-Webster, OCLC, OneLook Dictionary Search, Random House Reference, Scholastic Reference, The Free Dictionary, Time Warner, Wordster.
Conducted analyses of multimedia product containing multiple major reference books and developed plan for improving the index/browser, access, and presentation of the text.
Contributed thousands of lists for Wikilists (Wikia).
Copyedited and proofread content and documentation for multimedia reference products.
Created collection of questions with paraphrases and sentence simplification for natural language system. Prepared corpus annotation for natural language processing.
Created ontologies, taxonomies, and hierarchies for clients such as CitySearch, CNET, Columbia Encyclopedia, Farlex/FreeDictionary, Funk and Wagnalls Encyclopedia, GoTo/Overture, Grolier Encyclopedias, HubPages, Merriam-Webster, OneLook, Wordster.
Designed and compiled a variety of word books.
Designed and compiled A-to-Z thesauri and conceptually arranged thesauri.
Designed categorization scheme and encoded 200,000+ articles in encyclopedia sets.
Designed categorization schemes for major reference works.
Developed lexicons and major reference works.
Wrote, revised, and copyedited dictionary entries.
Edited databases of reference materials for software products, including spelling checkers, thesauri, dictionaries, encyclopedias.
Prepared Americanization of British dictionary content.
Prepared pronunciations in a respelling system for dictionaries.
Researched and wrote many how-to / instructional articles.
Set up hierarchical schemes and synonym tables for online shopping, city directories, and search engine.
Completely revamped WordNet: added synonyms, removed erroneous synonyms, enhanced with weighting of terms (obscurity coding) at sense and word level.
Wrote lexicon of most frequent 10,000+ words, assigned to a 1000-concept hierarchy for use in a natural-language processor.
Wrote workbook on lexicography for dictionary users and students.
Always meet deadlines, often finishing ahead of target.
Experienced at writing proposals and reports based on ideas, needs, and research.
Extremely good organizational and planning skills.
Have supervised assistants, worked closely with programmers/engineers.
More than 30 years’ experience as a teleworker; very willing to make regular and special visits to headquarters.
Offer a unique combination of talent and training: PhDs in linguistics, archaeology, and Buddhist studies; work in artificial intelligence research, reference publishing, reference software, Internet content and question-answering; and development of categorization schemes, hierarchies, and ontologies.
Strength is self-motivation to complete tasks beyond expectations and on time.
Very communicative and easily reached via email, video chat, and phone.
My PhD and MPhil in Linguistics illustrate the kind of work I am capable of doing independently. Both dissertations were well received by the lexicological community. I also have a PhD in Archaeology and a PhD in Buddhist Studies.
Though I am not a programmer, I have a great deal of experience in analyzing data computationally, and I am extremely comfortable working with utilities like Microsoft Excel and Word, Google Drive, and Mathematica. Given a problem, I am effective in acquiring and analyzing the data required to solve it. As a resource person in the lexicographic area, I do not know anyone who is more competent than I am. My skills and experience are ideal for that role and for other tasks involving working with dictionaries, thesauri, encyclopedic and other reference works, and taxonomic/categorization schemes (hierarchies, ontologies).
I have a great deal of experience developing content and developing frameworks for content for question-answering systems and for natural language processing. Also, my extensive experience in thesaurus and dictionary work points to something that I could provide or expand: a back-end dictionary and/or thesaurus that makes further connections within the search and metadata systems.
The tremendous collection of lexical resources that I have developed over 40 years – dictionary and thesaurus material – may be used to assist efforts in semantic technology.
I have worked as a telecommuter for more than 30 years. I work very well by myself, but I also interact productively with other people. I have mainly worked for California-based companies, and the time and geographic differences are undetectable because I get things done right and I get them done quickly.
I consider myself a lexicographer with computational and theoretical experience and a yen for electronic adventure. I like to be a part of a team building unique tools for working with language. I have experience building large-scale computational lexicons, designing and developing new lexical components, and making use of online knowledge bases. I am familiar with lexical issues at various levels and familiar with popular linguistic theories and their implications for content and structure of the lexicon.