Posted by Geraldine Hunt on Fri, May 1st, 2015
You've just received a report from your hosting provider or a third party that your site is hosting phishing content. While your first instinct may be to remove the files, change passwords, and update everything as fast as possible, that would be a big mistake. I know you want to get that Apple login page off of your site immediately. However, if you take a few minutes to gather information, you will have a much better result than if you do not.
In this article we'll discuss what to do if your website has become infected with phishing content, including how to find the vulnerabilities and clean your site of phishing content. Depending on the type of hosting you have, your provider may be able to help. In any case, I recommend that you consult with a professional who has dealt with this type of issue before.
Step 1 – Find the source of infection
The priority is to find the source of the infection. Don’t make any changes before conducting a full investigation; it reduces your chances of finding the source. Hackers are very good at covering their tracks. Additionally, the original vulnerability is normally only exploited once to upload their own malicious software. Most clues are not going to point to the original attack unless the hackers were careless. So, you need every clue available.
In addition to finding the vulnerabilities in the site, you need to find all of the uploaded phishing content. There is likely more than one phishing page. Phishing sites are particularly tricky to find since they are designed not to disrupt your site. To scanners, their content appears to be a normal website, which makes automated detection ineffective.
How does infection with phishing content happen?
Before we go into the details, it's useful to understand how and why sites get infected with phishing content. Often, phishers use compromised computers to host malicious content and to carry out identity theft, financial fraud, and the harvesting of sensitive data from victims for future illegal use. Others hack the legitimate websites of businesses and organizations to gain administrative control, which lets them disguise their phishing activities.
The goal of a phishing site is to gain access to a user's login credentials. The most common targets are banks and large companies like Apple, eBay, Google, PayPal, and Microsoft, but realistically an organization or business of any size can be a target. These companies have their own dedicated security teams, and they also get help from government agencies and private security companies. The result is that phishing sites are normally discovered and removed fairly quickly.
This forces hackers to work fast, using automated tools to maintain their network of hacked sites and email servers. Therefore, any roadblocks you have in place reduce the chances that hackers will target your site. It's much more effective for them to move on to the next insecure site.
The most common reasons for a site or server being compromised are unapplied updates, neglected development sites, weak passwords or test accounts, and insecure custom scripts. Read our recent blog post on the need to update your applications and use strong passwords.
Is your Development site secure?
One thing that even experienced admins sometimes overlook is the security of their development site. If hackers gain access to your account via a vulnerability in your development site then they will have access to your regular site, too. If you use a development site, do not neglect its security. It's also highly recommended that any development sites or test accounts be removed once the work is complete.
Network Security Investigation
During any security investigation, keep detailed notes about what you find as you go along. Remember not to remove or modify anything, no matter how strong the urge is. Take note of the full path to any phishing sites you find as well as any malicious files, code injections, and scripts with unsafe code. Also keep track of their timestamps and any related log entries. Afterwards, you can use this information to determine the best course of action and to remove the malicious items.
Many times, you will have been given a report about specific files. But sometimes the information you have is limited. Several security experts I’ve spoken to like to use Sucuri Sitecheck to check for any known issues.
For this example, we have a report of a phishing link pointing to our WordPress site, which is running on a cPanel server.
This will be a basic example of the most common tactics used by hackers. There are many variations, and, frankly, even the professionals get stumped sometimes. After all, hackers are smart and they change their tactics constantly.
That said, most security professionals approach a phishing investigation never having seen the site and not knowing about any of the custom coding. The procedure discussed below may seem like a tedious process but with practice you can completely investigate and clean a standard sized website in 10-15 minutes. This assumes a Linux/Unix Apache Web Server.
First Things First! Review timestamps of affected files.
First, you need to review the timestamps of the affected files. You didn't alter or remove the files, did you? If so, don't worry. You'll likely find more malicious files later in the investigation.
To see the complete timestamps you need to use the “stat” command. Use it on the file you already know about.
# cd /home/user/public_html/Apple
# stat securelogin.html
Size: 37062 Blocks: 80 IO Block: 4096 regular file
Device: 807h/2055d Inode: 101845956 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 500/ user) Gid: ( 500/ user)
Access: 2015-04-15 09:11:12.000000000 -0400
Modify: 2015-04-10 15:35:11.000000000 -0400
Change: 2015-04-12 10:15:27.000000000 -0400
If you look closely at the above information, you can see that the Modify and Change dates are different. Modify refers to the content of the file, while Change refers to the metadata (i.e. filename, permissions, etc.). The Modify date is usually the most important, but both may be needed during the investigation. Take note of these.
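Since you will be noting these values for every suspicious file, a small helper can save some typing. This is only a sketch: it assumes GNU coreutils `stat` (the `-c` format option is not available on BSD or macOS), and the function name `record_times` and the notes file in the usage comment are made up for illustration.

```shell
# Print one line per file: path, Modify time, Change time.
# Assumes GNU stat; record_times is a hypothetical helper name.
record_times() {
    stat -c '%n | modify: %y | change: %z' "$@"
}

# Example (the notes file name is a placeholder):
# record_times securelogin.html >> ~/investigation-notes.txt
```

Reading the timestamps this way does not modify the files, so it is safe to run during the investigation.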
Step 2 – Compare to Logs
Now that you have a time for when some of the changes happened, you can compare it to the logs. The Apache access log is the best place to start. If nothing is found in Apache then you would most likely find an upload in the FTP logs or the cPanel logs. Since the Apache logs are the most common place to find the information you need, let's focus there.
You mostly need to see the POST commands so filter the logs for POST, date, and hour.
# grep POST /usr/local/apache/domlogs/user/example.com|grep '10/Apr/2015:15'|less
The timestamps will not match up exactly. Apache records the access time, which is at the start of the transaction, while the file system records the time of the last write to the file. Keep this in mind when reviewing the logs.
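Because the log entry can precede the file's Modify time by a few seconds or more, it can help to match on a timestamp prefix rather than an exact time. The sketch below assumes the Apache combined log format, where the fourth whitespace-separated field is the timestamp and the sixth is the quoted request method. The function name `posts_near` is hypothetical.

```shell
# List POST requests whose timestamp starts with the given prefix.
# Assumes the combined log format: field 4 is "[dd/Mon/yyyy:hh:mm:ss"
# and field 6 is the request method with a leading double quote.
posts_near() {
    awk -v prefix="[$2" 'index($4, prefix) == 1 && $6 == "\"POST"' "$1"
}

# Example: every POST between 15:30:00 and 15:39:59 on 10 April 2015
# posts_near /usr/local/apache/domlogs/user/example.com '10/Apr/2015:15:3'
```

Shortening or lengthening the prefix widens or narrows the time window.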
ARIN database & GeoIP
In this case you find several hundred entries like the one below. (Please note that this is just an example, so the IP is not valid.)
126.96.36.1995 - - [10/Apr/2015:15:35:03 -0400] "POST /wp-content/uploads/2013/06/xXx.php HTTP/1.0" 200 42726 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
There are several obvious signs of malicious activity in this record. Why is there a PHP file in the upload directory? This area is normally reserved for images and documents, so the file should not be there. The request also uses HTTP 1.0 and does not provide a referrer link, both of which are signs of a scripted request. And the request is identified as Googlebot, but if you check the IP address against the ARIN database or the GeoIP Tool you would see that it does not belong to Google.
Thrown off the trail? Common Hacker Tactics.
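Since PHP files do not normally belong in the upload directory, it is worth listing any that are present. This is a minimal sketch; the default path assumes a standard WordPress layout relative to the document root, and the function name `php_in_uploads` is hypothetical.

```shell
# List PHP files under an uploads directory, where normally only
# images and documents are expected.
# The default path assumes a standard WordPress layout.
php_in_uploads() {
    find "${1:-wp-content/uploads}" -type f -iname '*.php'
}

# Example, run from the document root:
# php_in_uploads
```

Some plugins do place PHP files there legitimately, so treat each result as a lead to check against the logs rather than as proof.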
Now you have a trail to follow so examine the “xXx.php” file.
# cd wp-content/uploads/2013/06
# stat xXx.php
stat: cannot stat ‘xXx.php’: No such file or directory
How can this be? The file has been moved or deleted by the attacker to throw us off the trail. It's a common tactic but it won't slow you down too much. It should be found later if it was moved. For now, just move on to the next clue, the Change time. Review the logs again looking for the Change time you saw earlier. Now there is a link to a different file using a new IP and user agent.
188.8.131.525 - - [12/Apr/2015:10:15:18 -0400] "POST /wp-content/themes/twentythirteen/js.php HTTP/1.1" 200 42726 "http://www.example.com/index.php" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:37.0) Gecko/20100101 Firefox/37.0"
This request doesn't appear to be scripted like the previous one. However, it is a POST to a file in the theme, which is unusual. Review the contents of the file to see if you can find out what the attacker was doing.
# cd wp-content/themes/twentythirteen
# less js.php
Obfuscated PHP code
This is obfuscated PHP code which doesn't necessarily mean that it is malicious. There are valid reasons for doing this and it could really be part of the theme. You can use UnPHP.net to decode it. While that service doesn't always completely decode it, usually it will give you enough information to determine whether it is malicious or unsafe.
If you decode the above, you will find that it is a standard script for uploading images. (I didn't want to include any real malicious code here.) Either way you look at it, this file was used by the attacker. Add it to your list and keep following the trail by matching file timestamps with log entries.
When to stop the investigation.
If you follow this same process on the “js.php” file you may find that it leads to yet another file to trace, points to a vulnerability in our application or theme, or it could even point back to itself. But let's assume in this case you just didn't find anything suspicious in the logs.
That's OK; sometimes it's not so obvious. Filter the logs a little differently this time to view all of the POSTs from that IP.
# grep POST /usr/local/apache/domlogs/user/example.com|grep -F '188.8.131.525'|less
You may need to filter the results more to find the important information. Let’s say that you found the following:
188.8.131.525 - - [21/Apr/2015:10:52:21 -0500] "POST /wp-login.php HTTP/1.1" 200 1937 "http://www.example.com/wp-login.php" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:37.0) Gecko/20100101 Firefox/37.0"
188.8.131.525 - - [21/Apr/2015:10:53:03 -0500] "POST /wp-admin/theme-editor.php HTTP/1.1" 200 17862 "http://www.example.com/wp-admin/theme-editor.php" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:37.0) Gecko/20100101 Firefox/37.0"
This indicates either a compromised password or a vulnerability in WordPress. You could review the logs further to find out which it is. To save time, at this point I would recommend applying updates if they are available and resetting all admin passwords.
However, you shouldn't stop the investigation here. Whether this is the original compromise or not there are likely to be more malicious files than we know about. Many times there are even multiple separate compromises over an extended period of time.
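When reviewing everything a single address has posted, a compact view of just the time and the path is often easier to scan than full log lines. The sketch below assumes the Apache combined log format. The function name `posts_from_ip` is hypothetical, and the address in the usage comment is a documentation placeholder, not a real attacker. Matching the first field exactly also avoids the problem of an unescaped `.` in a grep pattern matching any character.

```shell
# Print the timestamp and requested path of every POST from one IP.
# Assumes the combined log format; the IP is compared as an exact string.
posts_from_ip() {
    awk -v ip="$2" '$1 == ip && $6 == "\"POST" { print $4 "]", $7 }' "$1"
}

# Example (203.0.113.5 is a placeholder address):
# posts_from_ip /usr/local/apache/domlogs/user/example.com 203.0.113.5
```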
Continue Searching – Recently modified Files & Directories
Now that you've exhausted the list of known malicious files you need to be creative to find the remaining items. (Hope you didn't modify any files yet. The results of these tests may be skewed if you did.)
For this part of the investigation you need to start in the document root directory of the affected user's account. Examine the directory structure and the most recently modified files using the following command:
# ls -lrt
Right away you may see directories with the name of a company or that don't follow your naming conventions. Also look for files with timestamps out of sync with the surrounding files. You'll need to examine the contents and relevant log entries for any suspicious items you find as you did previously.
To find most of the phishing content it's best to examine each directory 2 to 3 levels deep from the document root directory. In each directory examine the directory structure and contents of the files that appear out of place. If they appear to be malicious, compare them to the logs and of course document everything.
Finally, go back to the document root directory and run some targeted searches. Start by searching for any files that have been modified during the last week; you may need to increase this to 30 or 90 days. If the list is too long, narrow it down with grep. You can even filter out files you are already aware of.
# find . -type f -mtime -7 | less
# find . -type f -mtime -7 | grep -Ev 'cache|log|Apple' | less
It's sometimes useful to check for files with recent Change times. For the most part, the results won't differ too much but there could be some key findings.
# find . -type f -ctime -7 | less
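If you already know roughly when the compromise happened, searching a fixed date range can be more precise than counting days back from today. This sketch assumes GNU find, which provides the `-newermt` test; the function name `modified_between` is hypothetical.

```shell
# List files whose content was modified after the start of date $2
# and no later than the start of date $3 (dates as YYYY-MM-DD).
# Assumes GNU find for the -newermt test.
modified_between() {
    find "$1" -type f -newermt "$2" ! -newermt "$3"
}

# Example: files modified from 10 April through 12 April 2015
# modified_between . 2015-04-10 2015-04-13
```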
Symlinks are commonly used in an attempt to break out of a user's account. Review any items you find with the following command. Most websites do not use symlinks at all, so any you find are almost always safe to remove.
# find . -type l
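When reviewing symlinks, the target matters more than the link name, since a link pointing outside the account is the suspicious case. This sketch assumes GNU find, whose `-printf` action supports `%l` for the link target; the function name `list_symlinks` is hypothetical.

```shell
# Print each symlink together with the path it points to.
# Assumes GNU find (-printf with %p for the path and %l for the target).
list_symlinks() {
    find "${1:-.}" -type l -printf '%p -> %l\n'
}

# Example, run from the document root:
# list_symlinks
```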
You're almost finished but let's run one more search to make sure you've found everything. Here is a command often used to detect code obfuscation, common with malicious code:
# grep -rlE 'eval|gzinflate|base64_decode|str_replace|str_rot13' .
Of course, this will also detect many items that are valid parts of your site. Cross-check each file against its code and its log entries to ensure it is not only valid but also safe.
Cleaning up the Phishing mess!
Now that you have a list of malicious files and you've identified any vulnerabilities in your sites, you need to get out of this mess! Make a backup now before proceeding. If it's a large site, I would actually have started the backup at the start of my investigation. Yes, even a backup of the compromised site could be useful. A good admin is never too careful.
After the backup, go ahead and completely remove any malicious files and reset any passwords that may have been compromised. Usually the code injected files will need to be cleaned manually. Open those files in an editor of your choice and remove the code injections. Malicious code injections are commonly found on the first or last line of a file (but not always).
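Because injections often sit on the first or last line, a quick preview of just those two lines can help you decide which files to open in an editor. This is only a sketch: the function name `peek_ends` and the 120-character preview width are arbitrary choices. It also prints each line's full length, since an unusually long first or last line may deserve a closer look.

```shell
# For each file, show the length and the first 120 characters of its
# first and last lines. peek_ends is a hypothetical helper name.
peek_ends() {
    for f in "$@"; do
        printf '== %s\n' "$f"
        awk '
            NR == 1 { first = $0 }
                    { last = $0 }
            END {
                if (NR > 0) {
                    printf "first (%d chars): %s\n", length(first), substr(first, 1, 120)
                    printf "last (%d chars): %s\n", length(last), substr(last, 1, 120)
                }
            }
        ' "$f"
    done
}

# Example:
# peek_ends index.php wp-config.php
```

This only reads the files, so nothing is changed until you edit them yourself.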
A Good System Administrator is never too careful.
If you identified any patterns with the malicious requests in the logs then you can block them with .htaccess rules. For instance, if all of the requests were from the same subnet or even country you could block them temporarily until you have fully secured your site. Here is a great reference for htaccess coding.
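As an illustration of such a block, the sketch below prints a rule that denies one or more addresses or subnets. It assumes Apache 2.4 with mod_authz_core and mod_authz_host and a server configuration that allows these directives in .htaccess; Apache 2.2 uses the older Order/Deny syntax, which is not covered here. The function name `deny_rules` is hypothetical and the subnet in the usage comment is a documentation placeholder.

```shell
# Print an Apache 2.4 rule that allows everyone except the listed
# IPs or CIDR ranges. Assumes mod_authz_core and mod_authz_host.
deny_rules() {
    printf '<RequireAll>\n    Require all granted\n'
    for net in "$@"; do
        printf '    Require not ip %s\n' "$net"
    done
    printf '</RequireAll>\n'
}

# Example (203.0.113.0/24 is a placeholder subnet). Review the output
# before appending it to a live .htaccess file:
# deny_rules 203.0.113.0/24
```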
You can now update any applications on your site and repair any insecure scripts you found in the investigation. When updating your software, be sure to check for things like TimThumb and TinyMCE, as they are frequently overlooked. If you're running a CMS like WordPress, then updating the following is usually enough to repair the issues:
- the WordPress core
- all plugins
- all themes
But if you have a custom built site then it's likely that you found an insecure upload form during the investigation. You'll need to add validation to the script to prevent its misuse. The simplest method is to add a password into your script so only you can use it.
You've found the vulnerabilities in your applications. You’ve already:
- removed the malicious files
- reset the passwords
- updated your site.
It's now clean, safe, and no longer hosting malicious content! However, your users are still seeing a safety warning from their browser or antivirus. Fortunately, getting de-listed from these lists is much easier than getting de-listed from a spam blacklist.
The most common list to get reports on is Google's Safe Browsing. You can use the following link to request a de-listing. Google will then crawl your site again and check for any issues. If none are found you should be removed from the list soon.
The most common mistakes we see causing security issues are the same across the board: unapplied updates, neglected development sites, weak passwords or test accounts, and insecure custom scripts.
I'm sure you found at least one of those problems during this process. Don't worry, it happens.
How you can minimize your web site’s vulnerability
The following measures help minimize your web site’s vulnerability to attack by phishers:
- Server OS hardening. “Hardening” is a process of securing an operating system so that it is difficult to attack. Use commercial and open source vulnerability scanners and security baseline analysis tools to identify
- Web application hardening. Web application hardening is a process of securing web server application software, web applications and scripts, and dynamic content against attacks. Again, use commercial and open source web vulnerability scanners to identify improper configuration settings and exploitable content. Consider using a commercial or open source web application firewall and content filtering technology to provide in‐line, real time examination of incoming web traffic for attack patterns and anomalies.
- Patch management. Maintain current patch levels on all operating systems and applications used for your web site.
- Secure programming, safe scripting. Do not use executable programs without verifying the authenticity and trustworthiness of the developer and the integrity of the code itself. The Open Web Application Security Project (OWASP) is a useful source for learning about secure programming and safe scripting.
- Compartmentalize. Create security domains within your network and separate these with security systems (e.g., firewalls) so that successful attacks against one server or service can be contained.
- Routine examination. Perform regular network, host, and web vulnerability and penetration tests. If possible, have an independent, experienced, and certified party perform a security assessment on systems that support your web site.
- Implement best practices for firewall filtering. Restrict traffic flow at firewalls as tightly as practical. Only allow access to TCP or UDP ports where your authorized services are listening, and further restrict flows to the IP addresses of the systems on which you are hosting listening services. Restrict outbound traffic flows from servers as well.
Now that all is well with your site, keep these things in mind. Log into your site regularly to check for updates. Use strong passwords and remove any test accounts and development sites as soon as you have finished with them. And if you have custom scripts, be sure to add some validation to your code to prevent misuse. All of this will help make a phisher’s life difficult. And remember, a good system administrator is never too careful!