domain_stats.py: a web API for SIEM phishing hunts (Tue, Jan 17th)

Last year, over the Thanksgiving break, Justin Henderson and I worked on a tool to provide a web API interface for another tool I released last year called freq.py. freq.py is used to identify randomized DNS names used by malware. Providing a web API would allow a Security Information and Event Management (SIEM) system to automatically score character frequency data from a variety of log sources. So freq_server.py was born. Since then it has been great to see different organizations using it and finding malware in their domain. This Thanksgiving Justin contacted me again with a new project and I was immediately intrigued.

This time Justin wanted to automatically score phishing domains based upon the "born on" date of the domain registration. Justin told me about some really great techniques that organizations can use to identify potential threats if they could query the data at the speed of SIEM. The difficulty is in processing the huge number of records that a SIEM collects without rudely overwhelming the whois servers. The system has to be quick, and we need to cache the data for frequent domains. Python makes building these types of interfaces quick and easy. Justin told me what he wanted and a few hours later I sent him a working prototype. The tool is available for download and integration into your SIEM now. Download a copy here. Check out SANS SEC573 to learn how to quickly develop programs like this on your own. There are two opportunities to take the Python course from me in the near future. Come see me in London or come see me in Orlando, Florida at SANS 2017.
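The caching requirement mentioned above can be illustrated with a small sketch. This is not the code from domain_stats.py; it is a minimal example that assumes a slow lookup function (here the hypothetical `lookup` argument) and wraps it in a time-limited in-memory cache, so that frequently seen domains do not trigger a new whois query each time. The class name `CachedLookup`, the one-day default lifetime, and the injectable `clock` are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class CachedLookup:
    """Wrap a slow per-domain lookup with a time-limited in-memory cache.

    Hypothetical helper for illustration only; not part of domain_stats.py.
    """

    def __init__(self,
                 lookup: Callable[[str], str],
                 ttl_seconds: float = 86400.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._lookup = lookup          # the slow call, e.g. a whois query
        self._ttl = ttl_seconds        # how long a cached answer stays valid
        self._clock = clock            # injectable so the cache can be tested
        self._cache: Dict[str, Tuple[float, str]] = {}

    def get(self, domain: str) -> str:
        # Normalize so "Example.com." and "example.com" share one entry.
        key = domain.lower().rstrip(".")
        now = self._clock()
        hit = self._cache.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]              # fresh cached answer, no slow call
        value = self._lookup(key)      # cache miss or expired entry
        self._cache[key] = (now, value)
        return value
```

With a wrapper like this, only the first request for a domain pays the whois cost; repeat requests within the lifetime are answered from memory. A production cache would also need a size limit and thread safety, which this sketch leaves out.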

So what can you do with this new tool? Well, it was Justin's idea, so I'll let him tell you. Take it away, Justin!

Thanks, Mark. We live in a day where data is everywhere. The security community is constantly stepping up in new ways to defend and protect this data. They are constantly inventing new techniques, tools, and processes. Yet some good techniques have been lost due to lack of publication or an inability to easily apply the technique at scale.

One such technique is using WHOIS information, specifically creation dates. Many who came before me have noted that companies typically do not do business with new domains. I dub these baby domains.

Running a simple whois kicks back lots of information, including a creation date. The problem: performance. The time taken by each run of whois varies greatly. In fact, in my testing it varied between about 0.50 seconds and 10 seconds, which, if run against millions or billions of domains, would not be able to keep up. Then there is Mark Baggett ...

If you are using an older version of the Alexa top 1 million, the number returned is the site's rank. Just remember that instead of using an old Alexa file you can also generate a custom list of most frequently accessed sites, known good sites, or even known bad sites. This can then be used to tag domains in order to skip more time-consuming checks such as WHOIS lookups, or to perform other tasks/logic.
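As a rough illustration of the custom-list idea, the sketch below loads a ranked list into a dictionary and looks a domain up in it. It assumes the list uses the same two-column `rank,domain` CSV layout as the Alexa top 1 million file, and it returns 0 for an unlisted domain to mirror the "0 is returned" behavior described for /alexa later in this article. The function names `load_ranked_list` and `rank_of` are hypothetical and do not come from domain_stats.py.

```python
import csv
import io
from typing import Dict, Iterable


def load_ranked_list(lines: Iterable[str]) -> Dict[str, int]:
    """Build a {domain: rank} mapping from 'rank,domain' CSV rows."""
    ranks: Dict[str, int] = {}
    for row in csv.reader(lines):
        if len(row) != 2:
            continue                   # skip blank or malformed rows
        rank, domain = row
        if rank.strip().isdigit():     # skip a header row, if present
            ranks[domain.strip().lower()] = int(rank)
    return ranks


def rank_of(domain: str, ranks: Dict[str, int]) -> int:
    """Return the domain's rank, or 0 when it is not on the list."""
    return ranks.get(domain.lower().rstrip("."), 0)


# A tiny in-memory list standing in for a real file on disk.
sample = io.StringIO("1,google.com\n2,youtube.com\n")
ranks = load_ranked_list(sample)
print(rank_of("YouTube.com", ranks))       # 2
print(rank_of("not-listed.example", ranks))  # 0
```

The same loader works for a known-good or known-bad list; only the meaning attached to a non-zero result changes.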

So how would you put this all together? I regularly use domain_stats.py by invoking it from SIEM products. For example, I take incoming DNS logs and use Logstash, a log aggregator and parser, to query domain_stats.py. If a DNS log matches certain query types, such as an A record or MX record, then Logstash applies the following logic:

Step 1 - Does the query match an internal domain name? If yes, no additional processing is required. If no, move to step 2.

Step 2 - Pass the domain to domain_stats.py using /alexa. If a result is returned, the domain matches a whitelist and no additional processing is required. If 0 is returned, it is not a well-known domain. Move to step 3.

Step 3 - Pass the domain to domain_stats.py using /domain/creation_date. Store the creation date for manual analysis. (I then use a dashboard/report to display baby domains that are less than 90 days old.)
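The three steps can be sketched in Python as follows. This is not the Logstash configuration Justin uses; it is a minimal illustration in which the two domain_stats.py queries are represented by caller-supplied functions (`alexa_rank` for /alexa and `creation_date` for /domain/creation_date), because the exact request and response formats are not shown in this article. The function name `triage`, the returned labels, and the idea of flagging baby domains inline rather than in a dashboard are all assumptions.

```python
from datetime import datetime, timedelta
from typing import Callable, Iterable, Optional


def triage(domain: str,
           internal_suffixes: Iterable[str],
           alexa_rank: Callable[[str], int],
           creation_date: Callable[[str], Optional[datetime]],
           now: datetime,
           max_age_days: int = 90) -> str:
    """Classify a queried domain using the three steps described above."""
    name = domain.lower().rstrip(".")

    # Step 1: internal names need no further processing.
    if any(name == s or name.endswith("." + s) for s in internal_suffixes):
        return "internal"

    # Step 2: a non-zero rank means the domain is on the whitelist.
    if alexa_rank(name) != 0:
        return "well-known"

    # Step 3: look up the creation date and flag young domains.
    created = creation_date(name)
    if created is None:
        return "unknown-age"           # lookup failed or no date available
    if now - created < timedelta(days=max_age_days):
        return "baby"
    return "established"


# Stand-in data instead of live queries to domain_stats.py.
ranks = {"google.com": 1}
dates = {"new-site.example": datetime(2017, 1, 1)}
print(triage("new-site.example", ["corp.local"],
             lambda d: ranks.get(d, 0), lambda d: dates.get(d),
             now=datetime(2017, 1, 17)))   # baby
```

Ordering the checks from cheapest to most expensive means the slow WHOIS-backed lookup only runs for domains that are neither internal nor well known.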

The ultimate deliverable is a list of baby domains being accessed, along with which systems are possibly engaging them. This could provide early detection of emails coming in from phishing domains or end users accessing phishing websites. When combined with other techniques such as fuzzy phishing (https://www.hasecuritysolutions.com/blog/fuzzy-phishing), catching phishing domains becomes much more likely and, quite frankly, a lot more fun.

A big shout-out to Mark Baggett for his awesome Python scripting skills that enabled this. Although it is written in Python, it has proven through repeated testing to outperform other solutions. I encourage everyone to consider using domain_stats.py for WHOIS queries (such as for creation dates) or for its whitelisting capabilities.

Follow Justin Henderson @securitymapper

Follow Mark Baggett @MarkBaggett

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.
