Two researchers from AT&T, Wei Wang and Kenneth E. Shirley, have had a domain-related paper published on the Cornell University website arXiv.org. The paper looks at whether certain word patterns and other lexical data in domain names can be useful in spotting malicious domains.
From the abstract: "In recent years, vulnerable hosts and maliciously registered domains have been frequently involved in mobile attacks. In this paper, we explore the feasibility of detecting malicious domains visited on a cellular network based solely on lexical characteristics of the domain names. In addition to using traditional quantitative features of domain names, we also use a word segmentation algorithm to segment the domain names into individual words to greatly expand the size of the feature set."
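To make the idea of segmenting a domain name into words concrete, here is a minimal sketch in Python. It is not the algorithm from the paper, which the excerpt does not specify. It is a simple dictionary-based dynamic program, and the vocabulary `VOCAB`, the function name `segment`, and the cost values are all assumptions chosen for illustration.

```python
from functools import lru_cache

# Hypothetical toy vocabulary (an assumption for this sketch). A real system
# would use a large word list, typically with word frequencies.
VOCAB = {"cheap", "nike", "shoes", "outlet", "pay", "day", "payday",
         "loan", "loans", "sale"}

KNOWN_WORD_COST = 1    # cost of emitting one vocabulary word
UNKNOWN_CHAR_COST = 10  # cost of emitting one character that matches no word


def segment(label):
    """Split a domain label (e.g. 'cheapnikeshoes') into a list of tokens.

    Picks the split with the lowest total cost, so fewer and longer
    vocabulary words are preferred over many short ones or stray characters.
    """
    label = label.lower()

    @lru_cache(maxsize=None)
    def best(i):
        # Returns (cost, tokens) for the suffix label[i:].
        if i == len(label):
            return (0, ())
        candidates = []
        for j in range(i + 1, len(label) + 1):
            piece = label[i:j]
            if piece in VOCAB:
                cost, rest = best(j)
                candidates.append((cost + KNOWN_WORD_COST, (piece,) + rest))
        # Fallback: treat the current character as an unknown one-letter token.
        cost, rest = best(i + 1)
        candidates.append((cost + UNKNOWN_CHAR_COST, (label[i],) + rest))
        return min(candidates)

    return list(best(0)[1])


print(segment("cheapnikeshoes"))
print(segment("paydayloans"))
```

The resulting tokens can then be used as bag-of-words features alongside quantitative features such as the length of the domain name.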
Among the largest 400 out of these 5327 coefficients (i.e. those most strongly associated with maliciousness) were several words that fell into groups of related words, which we manually labeled in the following list:
1) Brand names: rayban, oakley, nike, vuitton, hollister, timberland, tiffany, ugg
2) Shopping: dresses, outlet, sale, dress, offer, jackets, watches, deals
3) Finance: loan, fee, cash, payday, cheap
4) Sportswear: jerseys, kicks, cleats, shoes, sneaker
5) Basketball Player Names (associated with shoes): kobe, jordan, jordans, lebron
6) Medical/Pharmacy: medic, pills, meds, pill, pharmacy
7) Adult: webcams, cams, lover, sex, porno
8) URL spoof: com
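As a rough illustration of how such word groups could be applied to already segmented domain names, the sketch below tags a list of tokens with the groups listed above. This is only a lookup table, not the paper's model, which relies on fitted coefficients rather than fixed keyword lists. The names `GROUPS` and `tag_tokens` are assumptions introduced for this example.

```python
# Word groups copied from the list above; the dictionary layout and the
# short group keys are assumptions made for this sketch.
GROUPS = {
    "brand": {"rayban", "oakley", "nike", "vuitton", "hollister",
              "timberland", "tiffany", "ugg"},
    "shopping": {"dresses", "outlet", "sale", "dress", "offer", "jackets",
                 "watches", "deals"},
    "finance": {"loan", "fee", "cash", "payday", "cheap"},
    "sportswear": {"jerseys", "kicks", "cleats", "shoes", "sneaker"},
    "basketball": {"kobe", "jordan", "jordans", "lebron"},
    "pharmacy": {"medic", "pills", "meds", "pill", "pharmacy"},
    "adult": {"webcams", "cams", "lover", "sex", "porno"},
    "url_spoof": {"com"},
}


def tag_tokens(tokens):
    """Return the sorted names of all groups matched by any token."""
    matched = set()
    for token in tokens:
        for group, words in GROUPS.items():
            if token.lower() in words:
                matched.add(group)
    return sorted(matched)


print(tag_tokens(["cheap", "nike", "shoes"]))
```

A match here only means that a token belongs to one of the groups; on its own it says nothing certain about whether a domain is malicious.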














