COMPUTERS
· Languages: C, C++ (STL), Perl, Pascal, SQL, AppleScript, PL/I, VBA, some PHP and ASP.NET.
· Algorithms Invented : hyphenation, spell correction, grammar checking, terminology extraction, syntactic parsing, morphological discoverability, language identification, context prediction, word list compression, morphological parsing, automated parsing of compounds, translation identification, lexical database management (among others).
NATURAL LANGUAGES
Native: English
Near-Native: Norwegian, Russian
B.A. (valedictorian): German
Read fluently: Swedish, Danish
Read semi-fluently: French, Dutch
Read with a dictionary: most Indo-European East and West European languages
EDUCATION
University of Trondheim, Norway
· Doctor Philosophiae (the older and higher of the two PhDs offered in Norway) -- dissertation on sound symbolism, What's in a Word.
GPA: N/A
Dates: November 2001.
Massachusetts Institute of Technology
· PhD candidate in formal theoretical linguistics. I have completed all necessary course work and qualifying exams. Left MIT prior to graduation in order to start Circle Noetic Services.
GPA: 4.9/5.0
Dates: Sep. 1981 to May 1985.
University of Leningrad
· Studied Russian syntax, phonetics, literature, history.
GPA: 5.0/5.0
Dates: Sep. 1980 to Jan. 1981.
Colorado State University
· B.A. in German with a minor in mathematics (completed all but one course, in advanced calculus, required for a B.S. in math).
· Graduated summa cum laude with Honors; chosen best Russian and German student at CSU in 1979 and 1980; Phi Beta Kappa; awarded the President, Honors and Westfall Scholarships. Was the only student to ace the 2nd year university-wide calculus final in 1979 (volumes, methods of integration, convergence, infinite series).
GPA: 4.0/4.0.
Dates: Sep. 1976 to May 1977, Sep. 1978 to May 1980.
University of Oslo
· Grunnfag degree in linguistics.
GPA: 1,9 (a good mark)
Dates: Sep. 1977 to May 1978.
EXPERIENCE
Nuance (Oct. 2012 - present)
I am working as a principal software engineer, primarily for their Swype application.
· I built a translation-memory app, about 6000 lines of code in Django with a SQL Server back end, for managing the Swype localization strings. We couldn't use off-the-shelf products because the app had to be able to manipulate translations into 80 languages simultaneously.
· I built their Hot Words phrases app.
· I built syllable checkers for languages with Indic scripts.
· I tweak various tools and applications: class-based language model, machine learning tool to extract related terms using a seed list,...
· I built the word list for their Dragon Drive app.
· I build language models for Swype.
· I manage localization for Swype, which in 2015 amounted to about 200,000 translations into 80 languages.
Swype (Oct. 2011 - Oct. 2012)
I was working as a senior computational linguist for the Android app Swype.
SDL International (Feb. 2002-May 2011)
I worked as a senior computational linguist for their machine translation software. I primarily wrote supporting software in Perl, SQL and C++. This includes:
· I wrote a translatability index algorithm, which uses various heuristics to guess at the quality of the machine translation for a given sentence into a given language. Machine translations were ranked by humans, and I then compared programmatic codes created during the machine translation of these sentences with the assigned rankings to create a mathematical algorithm.
· I built a 70+ page documentation Web site which unified existing documentation, made it more accessible, and added much new documentation concerning directory structures, available tools and utilities, undocumented parts of the MT engine structure, etc. Most of the content was created by my colleagues -- I did the graphics (in Photoshop) and the HTML Web pages (in PageMill and FrontPage), and also wrote some of the content.
· I wrote a file comparison utility in Perl which highlights sentences that are missing from one file or the other, and specific words which differ when comparing two similar sentences. Output is in HTML.
· One of my colleagues and I proposed an experiment in which we combined the German-English MT system with the English-Spanish MT system to create a 'Frankenstein' German-Spanish MT system. I wrote Perl scripts for this system which combined the dictionaries and unified and made consistent the tables, and my colleague did all the rest. The result was, as expected, neither wholly wonderful nor wholly useless.
· I installed MySQL on a Windows 2000 machine and built SDL's first and only multilingual 'sentence database' on the local network using (primarily) MS Access as an interface via ODBC. This project also involved selecting and classifying sentences of various types in various (human) languages, marking their syntactic structures, etc. My idea was to make debugging of MT more efficient by allowing developers to target sentences for machine translation which contained those linguistic structures which they were trying to improve upon.
· I put the SDL dictionaries into a relational database for the first time to create a lexical database. This involved a C++ program which significantly reorganized and reformatted the existing dictionaries, as well as the construction of a MySQL relational database and an MS Access user interface. This is a much more flexible system which makes the dictionaries considerably easier to modify, access and analyze.
· Designed and wrote (single-handedly in C++ using STL) a table-driven Phrase Finder technology for 6 European languages. Phrase Finder was the technology which made possible SDL's Knowledge Based Translation System, for which a one-million-dollar contract was signed within a few weeks of release. The first review states, "The evaluation of ExtraTerm (the leading competitor) and PhraseFinder has been performed on a limited number of tests... However the results obtained have always shown considerably higher values when working with PhraseFinder... In general, I found extraction with PhraseFinder very effective, certainly more than with ExtraTerm, as it provides almost always valid candidates." It uses linguistic methods to identify idiomatic phrases and terms in the text, and outputs these phrases along with relevant data such as their frequency, ranking, part of speech, root form, etc. It has the capacity for a full-fledged syntactic parse using a parse engine of my own design. This parser discovers all possible parses for a given set of rules on a given sentence and represents them in a single, non-redundant (STL-based) data structure. The engine was also written so that all lexical and linguistic data is read in from relational databases. This way the computational linguist has a great deal of control over the system through the data without having to modify the code and can experiment simultaneously with several alternatives.
· I built a simple (MFC dialog-box based) GUI for the PhraseFinder for my own testing. This was, however, replaced for the final product with a more sophisticated one written by one of my colleagues.
· I designed and wrote a terminology mapping technology which reads in large databases of humanly translated sentences and (using the lexical database and parser mentioned above) finds which words in the source language map to which words in the target in any one of 15 language directions (English to German, French to Italian, etc.). (This is especially important for specialized texts.) Trados also has a technology similar to this one, which runs 10-12 times faster than mine, but mine is more accurate. Mine misses about 10-15% of the terms in a file, but for translations it does propose, it is over 95% accurate. Trados offers translations for nearly every term, but is only about 50% accurate.
· I designed and wrote an inflection tool in Access (requiring about 2400 lines of VBA code) for internal use. Using an Access table of words in some language, it sends the user through a sequence of Wizards which set up a new language, and then provides forms which allow him or her to build inflection rules and paradigms for the words provided for that language. The user can, for example, open a form which displays the word and any grammatical information the user has set for the word, including the word's inflection paradigm. The word is then inflected in real time using that paradigm. The tool automatically chooses the possible and likely existing paradigms for the word (based on a match between the word structure and the paradigm structure) and fills in a drop-down menu with the relevant possible paradigms. The user can then change the paradigm and immediately see the effect of the change, or can define a new paradigm. Once the user is happy with a paradigm, it can be propagated automatically to similar words. Using this tool, I was able to inflect every word sense in the 6 SDL translation dictionaries (which on average contain about 40,000 word senses each -- i.e. a total of 240,000 word senses) in about 3 weeks.
1994-2002
In 1994, I sold my share of Circle Noetics, and spent the next years developing research in a neglected branch of linguistics which particularly interests me. During this time I have:
· Published a book, The Gods of the Word.
· Co-founded the Linguistic Iconism Association, which now has over 300 members.
· Founded a peer-reviewed academic journal Iconicity in Language.
· Developed an extensive award-winning Web site about sound symbolism and linguistic iconicity.
· Written a 700 page Dictionary of English Sound, soon to be published by Weidler Verlag.
· Written a doctoral dissertation entitled What's in a Word? for which I received a PhD degree from the University of Trondheim in Norway. Soon to be published by Weidler Verlag.
In addition I have in recent years taken on various other projects that interested me:
· I worked as a subcontractor for Microsoft, writing Italian morphological parsing software which they incorporated into their electronic book devices.
· I've designed Web sites for a few local businesses and individuals.
· Taken some freelance translation work from German to English and from Norwegian to English. Translated a Norwegian number one bestseller, Veien til Karlsvogna, at the request of its author.
· I've written some Internet-related commercial AppleScript applications.
Circle Noetic Services
Position: CEO, co-founder, developer
Dates: 1985-1993.
Major products (OEM linguistic software): Dashes hyphenation algorithm, Password spelling checker, WordFan morphological parser and generator, InOtherWords lexical database. Dashes and Password are sold as stand-alone products on the Mac and IBM PC, and all products are sold on an OEM basis to be incorporated into other software. Customers include(d) Linotype, LetraSet, Atex, RagTime, Quark, Microsoft, US West, among others.
Duties at CNS:
Software Development:
· Was the primary designer of all CNS software. In particular:
· Designed all of the Dashes hyphenation software (in C) for 18 languages and was responsible for maintenance of all data files.
· Co-designed a linguistic compression method for the spelling checker.
· Designed and wrote thousands of entries for the "In Other Words" lexical database, including a word net, semantic selectional restrictions, morphological parses, syntactic subcategorization frames, pronunciation tables, etc.
· Designed a text retrieval tool, WordFan (in C), which, given an input word, generates a list of morphologically related words. There were two such programs -- one dictionary based and one algorithmic.
· I also maintained and developed most of the data files and many of the dictionaries for Password. This, combined with the Dashes and WordFan development, involved reading, typing and analyzing millions of words in 28 languages.
Management:
· Was responsible for managing all of the employees and subcontractors, with the exception of some of the developers of the Spanish dictionary in 1988 and one programmer. This involved hiring, supporting employees, solving personnel problems, setting company policies, employee evaluations, firing, and training.
· In 1988-1990, I was responsible for managing all orders, customer requests, prioritizing projects and seeing to it that they were completed within a reasonable amount of time. Projects included mass mailings, trade shows, software development, software ports, sales, contract negotiations, and correspondence.
Finances:
· In 1988 and part of 1990, kept all the books for the company -- social security, estimated taxes, payroll, accounts payable, accounts receivable, etc.
· Handled collections and organized auditing of companies that were delinquent.
· Through 1988, I did all income taxes.
Sales and Marketing:
· Designed the company sales and marketing strategy.
· Wrote the copy for all advertisements, press releases and company literature.
· In 1988 and 1989, I organized all printing, trade shows and managed all contact with the press.
· Did many sales calls and negotiated most contracts. I also revised contracts of various types based on prototypes provided by our lawyer.
· Wrote several technical manuals for various software applications.
University of Trondheim, Trondheim, Norway
Position: post-doc.
· Initiated and did the first entries for a lexicon project for the U. of Trondheim Linguistics Department, now called TROLL, which is still active and employs people.
Dates: November 1986 - May 1987
Compugraphic Corporation
Position: programmer.
· Designed and wrote a hyphenation subroutine (in Pascal) for their DTP program, which was to run on the Apple LISA.
Dates: June 1985 - August 1985
Massachusetts Institute of Technology
Positions:
· Research Assistant in a government funded Cognitive Science research project to design compact dictionary databases and efficient methods for accessing them.
· Teaching Assistant in introductory linguistics courses.
· Resident Assistant (or counselor and academic tutor) for an undergraduate dormitory, MIT's Russian House.
· Grader and evaluator for essays required of undergraduate applicants and entering freshmen.
Dates: September 1981 to May 1984.
National Center for Atmospheric Research
Position: Programmer
· Was the only person on a project to design and code a computerized library circulation system for the NCAR library, including check-ins, check-outs, renewals, overdues and statistics. (PL/I) I implemented all the functionality, but didn't have time to debug it fully before I left to return to graduate school.
Dates: Jun. 1982 to Aug. 1982.
Position: Programmer
· Was the only person on a project to design and code a full text retrieval system for a meteorological database which NCAR licensed from Lockheed. (PL/I)
Dates: Jan. 1981 to Aug. 1981.
PUBLICATIONS
· "Review of Gruber's 'Look and See' and Jackendoff's 'Grammatical Relations and Functional Structure'", Lexical Semantics in Review, MIT Center for Cognitive Science, 1985.
· I wrote quite a few manuals for software products in the years 1985-1990. I retain some examples, which I can produce on request.
· "Mincing Words", Language International, 1998.
· The Gods of the Word: Archetypes in the Consonants, Truman State University Press, 1999.
· "I Am the Utterance of my Name", Ariadne's Web, 5 (2), Winter 1999/2000.
· Have published 13 of my poems with 7 different reviewed journals or reviewed poetry Web sites.
· Founder of the academic journal, Iconicity in Language.
· "Review of Gérard Genette's Mimologics", Iconicity in Language, 2001.
· A Dictionary of English Sound, a 700 page sound-symbolic dictionary, 2002, to appear from Weidler Verlag in Berlin, Germany.
· What's in a Word: Studies in Phonosemantics, my doctoral dissertation, 2002, to appear from Weidler Verlag in Berlin, Germany.
· The Oxford Handbook of the History of Linguistics, Keith Allan (ed.), Chapter 9: Sound Symbolism, to appear March 2013.
PERSONAL
I was born Margaret Hope Magnus in Boulder, Colorado, and grew up primarily in Colorado and also in Trondheim, Norway. Both parents and one brother are/were mathematicians, and the other brother is a theoretical physicist, so I'm the only one who went philological. I play piano, ski, hike, and play basketball. For the last couple of years I've been a member of an orienteering club, though I'm not very good at it yet. I can get addicted to games like chess, Set, bridge and Boggle. Favorite authors/works include (in no particular order) Ralph Waldo Emerson, T.E. Lawrence (of Arabia), Adam Smith, Joseph Campbell, Shakespeare, Tolkien, Goethe, Dostoevsky, Herodotus, Anna Akhmatova, Ayn Rand, Alexis de Tocqueville, A Course in Miracles, Richard Feynman, Agatha Christie, Conan Doyle. At present I live in Southern New Hampshire with my two children. I am an American citizen.
REFERENCES
· Jay Marciano, Director, Real-time Translation Development at Lionbridge and my former boss at SDL.
· Thomas Everth, software engineer at TE Software, former CEO and co-developer of the RagTime DTP software (a client of Circle Noetics) and a former colleague at CNS.
· Deric Villaneuva, Director of Operations at Swype.
Other references available upon request.