#############################
# metaphonerules.d
#############################
# by Steve Southwell - BravePoint, Inc.
# Copyright 2003 - The FreeFrameWork Project, Inc.
# Use freely according to FFW License: http://www.freeframework.org/license.shtml
#############################
# How to use:
# 1. Ensure that this file is somewhere in your propath
# 2. Include metaphone.i in your program
# 3. Use toMetaphone("yourtexthere").
#############################
# How to Edit
# 1. Lines beginning with # are comments
# 2. Anything after the # on a line is a comment
# 3. White space is fine, but will not be considered part of the substitution text
# 4. Each data line consists of From-text and to-text - delimited by a comma
# 5. The defaults provided should work pretty well for English.
# 6. If you make improvements to this, please consider passing them along to us!
#    webmaster@freeframework.org
# 7. The ^ character matches the beginnings of words
# 8. The $ character matches the ends of words
# 9. Substitution is case-insensitive. All-caps are used for clarity
# 10. Substitutions are cumulative, so make sure you have them in the right order!
# 11. Use the special keyword to indicate that doubled letters should be
#     combined into a single letter.
# 12. To eliminate a certain pattern, just enter it on a line by itself, followed by
#     a comma
#############################
# End results
# The text will end up consisting only of the following letters:
# B,E,F,H,J,K,L,M,N,S,T,W,X,Y,Z
# Each of the letters has roughly its normal sound with the following exceptions:
# X = sh or ch sound
# Z = th sound
# E and Y can only be at the beginning of words

# Y almost always sounds like I except at beginning of words:
^Y,%
Y,I
%,^Y

# Oddballs that mess up other rules
LAUGH,LAF     # LAUGH
ASCIS,ASHIS   # FASCIST

# Beginning of word replacements:
^GN,^N        # GNOME
^GH,^G        # GHOST
^JUA,^WA      # JUAN
^KN,^N        # KNOW
^PN,^N        # PNEUMONIA
^PS,^S        # PSYCHOLOGY
^PF,^F        # PFIZER
^RH,^R        # RHOMBUS
^TS,^S        # TSUNAMI
^WHO,^HO      # WHORE
^WH,^W        # WHALE,WHEN,WHY
^X,^S         # XEROX

# End of word replacements:
AJO$,AHO$     # NAVAJO
CHT$,T$       # YACHT
EJO$,EHO$     # VIEJO
FTH$,TH$      # FIFTH
GHT$,T$       # FLIGHT
GN$,N$        # FOREIGN
GNS$,NS$      # ALIGNS
GNED$,ND$     # ALIGNED
ILLA$,IA$     # TORTILLA, VILLA - Most likely Spanish pronunciation
ILLAS$,IAS$
MB$,M$        # DUMB
MN$,M$        # CONDEMN
OGH$,O$       # VAN GOGH
OUGH$,OF$     # ENOUGH,COUGH,ROUGH
AUGH$,A$      # LIMBAUGH

# Other anywhere replacements
CCE,KSE       # SUCCESS
CCI,KSI       # SUCCINCT
PH,F          # PHARMACY,STAPH
SCE,SE        # SCENE
SCHO,SKO      # SCHOOL
SCHE,SKE      # SCHEMA,SCHEDULE
SCH,SH        # BUSCH
SCI,SI        # SCIENCE
GG,K          # EGG,BRAGGING

# First set of single letters
Z,S
X,KS
Q,K
V,F

# SH = X except when preceded by S (must be before doubled letter rule)
SS,%
SH,X
%,SS

# Before we check for doubles...
CK,K          # CLOCK,KICKER

# SPECIAL SYNTAX FOR REMOVING DOUBLED LETTERS:

# More combos:
CIA,XA        # OFFICIAL
DGE,JE        # JUDGE
DGI,JI        # ? ORIGINAL METAPHONE RULE ?
GI,JI         # GIN
GE,JE         # GENDER
SIA,XA        # PERSIAN
SIO,XA        # TENSION
TCH,X         # ITCH
TIA,XA        # MARTIAN
TIO,XO        # ACTION
TH,Z          # THIS,THAT,THE OTHER

# Still more combos:
CH,X          # SANDWICH
CI,SA         # CINDER
CE,SA         # CENTRAL
GH,K          # SPAGHETTI,AGHAST

# Second set of single letters:
C,K
D,T
G,K

# All vowels now become A for a moment
E,A
I,A
O,A
U,A

# Now special rules that deal with proximity to vowels in general:
AH,%
%A,AHA
%,A           # H silent if after vowel and no vowel follows
W,%
%A,WA
%,            # W silent if NOT followed by a vowel like WRITE,WRAP,WRENCH

# Preserve beginning of word vowel, but delete the others.
^A,^E
A,

# ALL DONE!
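
#############################
# Worked traces (illustrative only)
# These hand traces sketch how the cumulative, ordered substitutions above
# combine. They assume each rule is applied once, in file order, as a plain
# substring replacement, with ^ and $ anchoring to the start and end of the
# word, as the "How to Edit" notes describe. They were worked out by hand and
# have not been checked against metaphone.i.
#
# KNIGHT
#   ^KN,^N     KNIGHT -> NIGHT
#   GHT$,T$    NIGHT  -> NIT     (so the later GH,K rule never sees this GH)
#   I,A        NIT    -> NAT
#   A,         NAT    -> NT
#
# WRITE (shows the % placeholder idiom)
#   E,A / I,A  WRITE  -> WRATA
#   W,%        WRATA  -> %RATA   (every W is parked as %)
#   %A,WA      no match          (this W is not followed by a vowel)
#   %,         %RATA  -> RATA    (leftover placeholders are dropped)
#   A,         RATA   -> RT
#
# The placeholder idiom is why rule order matters: a pattern is parked as %,
# the cases to keep are restored, and whatever is left is rewritten or removed.
#############################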