This is the first part of the Offensive OSINT tutorials, covering preparation (technical and mindset) and showing how to set up monitoring for the Bluekeep vulnerability in hospitals using Shodan and an Elasticsearch database.

Introduction

These tutorials will give you insight into OSINT techniques used by cyber criminals to target organizations and people in order to commit fraud and hacks. It won't be another site with a bunch of links and descriptions of what they do, but rather technical knowledge for professional security analysts with at least some coding skills and an analytical mind. I would like to cover each OSINT category: social media, dark web, leaks, industrial/corporate espionage and similar topics.

More about Offensive OSINT here

Offensive OSINT - Introduction
I've been fascinated by cyber security, especially Open Source Intelligence, for a long time and have done a lot of research in this field. Some of it was presented on my Medium blog https://medium.com/@woj_ciech. It covers a variety of cyber security topics: Leaks, Industrial Control Systems, Malware…

This first article of the series focuses on proper preparation, i.e. setting up the environment, getting a couple of tools together and installing a database. Each presented technique will be supported by research, and in this part we will take a look at exposed Remote Desktop Protocol services with the Bluekeep vulnerability, with a focus on medical and financial institutions.

In addition, I will show some OSINT & RDP tricks to find particular machines for specific espionage purposes using credential stuffing and reverse image search, so let's start.


Preparation

I feel like everything has already been said about mental preparation for OSINT investigations and how you must have an analytical mind, document everything and draw relationships. Of course, each assignment requires a slightly different kind of thinking, but in the end the approach is the same. Personally, a top-down approach works best for me: start from the general idea, divide the main goal into smaller tasks and focus on the details at the end.

In this case, the main goal is to find vulnerable RDP services in medical/financial institutions that may contain confidential information and that no one has found before. To achieve it, we need to split it into smaller tasks:

- Get information about exposed RDP

- Store the data

- Run checks periodically

- Compare fresh data with database

Clarifying the smaller tasks allows us to choose specific technologies for the research as well as data sources. We can already think about technical aspects like what kind of database to use (type, cloud/local), programming language (including IDE), services or operating system. With this in mind, we can translate our goals into technical requirements:

- Use Python and Shodan API to download information about exposed RDP

- Install Elasticsearch to store the data

- Use cron job for periodic tasks

- Again, use Python to compare new results with the database

There is of course more to it than that; the next steps are to figure out the best Shodan query, choose the check interval and configure Elasticsearch. It's just an example, but I have used this approach many times during investigations and while developing tools.

Now we have everything sorted out and can move on to the technical aspects.

Technical preparation

This part requires basic Linux terminal and Python coding skills, but everything will be explained. We can start by setting up our database.

Elasticsearch is a distributed, open source search and analytics engine for all types of data, including textual, numerical, geospatial, structured, and unstructured.

The documentation makes the installation process super easy. First, we need to download and install the signing key

wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

Install apt-transport-https

sudo apt-get install apt-transport-https

Save the repository definition

echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-7.x.list

and install Elasticsearch from the repository

sudo apt-get update && sudo apt-get install elasticsearch

to fire it up

sudo systemctl enable elasticsearch.service --now

To test it, go to

http://localhost:9200
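Alternatively, you can run a quick sanity check from the terminal; the cluster health endpoint is part of the standard Elasticsearch REST API, although the exact fields in the JSON response may vary between versions:

curl -X GET "localhost:9200/_cluster/health?pretty"

A fresh single-node install typically reports a "green" or "yellow" status.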

You can find configuration and further management details here

https://www.elastic.co/guide/en/elasticsearch/reference/current/deb.html

Python is one of the best languages for this kind of scripting task, and a good IDE can speed up the whole development process and make our lives a lot easier. It's not that people coding in a plain text editor are inefficient, but an IDE helps a lot to move freely around the code and manage your repository quickly. My favorite one is PyCharm: it creates a separate virtual environment for each project and supports syntax highlighting, multiple terminals, local history and many more features.

One thing you should definitely know about this IDE is its keyboard shortcuts. Sometimes I don't touch my mouse for hours while working in PyCharm; everything can be managed with specific shortcuts and it makes work extremely easy. I have picked a couple of the most useful ones for daily use.

2x Shift – search everywhere
Ctrl + / – comment/uncomment line
Ctrl + Shift + / – comment/uncomment block
Shift + F10 – run
Shift + F9 – debug
Ctrl + R – replace/find
Ctrl + H – history
Alt + F12 – open terminal

You can find all of them on the JetBrains website.


OSINT & Bluekeep

Let's import the necessary packages into our project

from elasticsearch import Elasticsearch
import requests
import json
import email.message
import smtplib

First, we need to create the Elasticsearch client and an index called rdp-monitoring where we will keep all the data.

es = Elasticsearch()  # connects to localhost:9200 by default
es.indices.create(index='rdp-monitoring', ignore=400)
Pycharm (left) Elasticsearch (right)

and now we can obtain RDP info from Shodan.

There are a couple of things that we must include

SHODAN_API_KEY = ""
query = "port:3389 org:hospital"
endpoint = "https://api.shodan.io/shodan/host/search?key="+SHODAN_API_KEY+"&query="+query+"&page="
cve = "CVE-2019-0708"
fresh = []

fresh is an empty list that will hold the new (fresh) IP addresses.

And the code for extracting hosts vulnerable to Bluekeep:

try:
    shodan_request = requests.get(endpoint)
    shodan_json = json.loads(shodan_request.text)
    for result in shodan_json['matches']:
        if not exists(result['ip_str']):
            check_each_host = requests.get("https://api.shodan.io/shodan/host/" + result['ip_str'] + "?key=" + SHODAN_API_KEY)
            check_each_host_json = json.loads(check_each_host.content)
            if 'vulns' in check_each_host_json:
                if cve in check_each_host_json['vulns']:
                    fresh.append(result['ip_str'])
                    print("New IP:" + result['ip_str'])
                    es.index(index="rdp-monitoring", id=check_each_host_json['ip_str'], body={"organization": check_each_host_json['org']})
        else:
            print("IP exists")

    if fresh:
        send_notification(fresh)

except Exception as e:
    print(e)

It makes an API request to Shodan with the query "port:3389 org:hospital" (I haven't found a more precise dork for Bluekeep), iterates over the results and then makes another request to examine each host for CVE-2019-0708. Shodan does not always return vulns in general search results, so we must check each IP separately (check_each_host).

After that we check if there are any vulnerabilities (vulns) and whether the list contains our CVE-2019-0708. If it does, we add the host to Elasticsearch.

es.index(index="rdp-monitoring", id=check_each_host_json['ip_str'], body={"organization": check_each_host_json['org']})
I'm using the IP address as the ID field; I'm doing it for simplicity and easier lookups without using source filtering.

There are around 70 results. The script checks only the first page; you can extend the code to paginate for more results, as shown in the sketch below.
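A minimal pagination sketch, reusing the endpoint variable defined earlier (the page number simply gets appended to it); Shodan returns up to 100 results per page, and the per-host vulnerability check stays the same as in the snippet above:

page = 1
while True:
    shodan_request = requests.get(endpoint + str(page))
    shodan_json = json.loads(shodan_request.text)
    matches = shodan_json.get('matches', [])
    if not matches:
        break  # no more results
    for result in matches:
        # run the same exists()/vulns check on result['ip_str'] as in the snippet above
        print(result['ip_str'])
    page += 1

Keep in mind that requesting additional pages may consume Shodan query credits, depending on your plan.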

So far, we have saved IP addresses and related organizations to Elasticsearch, but the next time the script runs it must recognize which entries are not in the database yet. To do that we can write an additional function, exists, which is used in the main snippet above (the if not exists(...) check on line 5).

def exists(ip):
    try:
        es.get(index='rdp-monitoring', id=ip)
        return True
    except Exception:
        return False

It just tries to retrieve a document based on the index and ID, which in our case is the IP address. If the entry does not exist, Elasticsearch returns a 404 error; that is how we can "search" without filters.
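If you prefer to catch only the 404 case rather than every possible exception, the Python client ships a dedicated exception class for it; a small variant of the function above:

from elasticsearch.exceptions import NotFoundError

def exists(ip):
    try:
        es.get(index='rdp-monitoring', id=ip)
        return True
    except NotFoundError:
        return False

This way a connection problem raises an error instead of being silently treated as a missing entry.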

The last part is staying up to date and getting an email each time a new finding appears. To achieve this we need to write another function which uses a Gmail account to notify us about new machines.

def send_notification(ips):
    body = "<h1>New IPs in hospitals with Bluekeep</h1><br>"
    ips_text = ""
    for ip in ips:
        ips_text = ips_text + "https://beta.shodan.io/host/" + ip + "<br>"

    msg = email.message.Message()
    msg['Subject'] = 'RDP Monitoring'
    msg['From'] = "@gmail.com"
    msg['To'] = "@gmail.com"
    msg.add_header('Content-Type', 'text/html')
    msg.set_payload(body + ips_text)
    gmail_user = "@gmail.com"
    gmail_password = ""

    try:
        server = smtplib.SMTP_SSL('smtp.gmail.com', 465)
        server.ehlo()
        server.login(gmail_user, gmail_password)
        server.sendmail(msg['From'], [msg['To']], msg.as_string())
        server.close()
        print('Email sent!')
    except Exception as e:
        print(e)
        print('Something went wrong...')

That's the simplest code, found somewhere online and modified to support text/html content. Basically, we first create an informational text with our fresh IPs (saved in the fresh list) and send it to a specific email address. You have to put your email and password in the code, and also log in to your Gmail account in a browser from the same machine to make it work.
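A small optional tweak: instead of hard-coding the password, you can read it from an environment variable (the variable name GMAIL_APP_PASSWORD here is just an example):

import os

gmail_password = os.environ.get("GMAIL_APP_PASSWORD", "")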

After running it, new records will appear in Elasticsearch and it should look like this

Working example for hospitals

You can put the script in cron, add it to your monitoring or combine it with other tools in your arsenal.
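For example, a crontab entry along these lines runs the check every six hours; the script path and log file are placeholders, adjust them to your setup:

0 */6 * * * /usr/bin/python3 /path/to/rdp_monitoring.py >> /var/log/rdp_monitoring.log 2>&1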

Email from send_notification function
Hospitals are not the only precious target: the query "port:3389 org:bank" will list all RDP services on port 3389 in financial institutions that have "bank" in their name. Going a step further, using free text search with a word like "scada" will show potential Industrial Control Systems, and compromising such a box is priceless for some threat actors.

Credential stuffing & RDP

Credential stuffing usually means accessing other services with a victim's compromised credentials. In most cases, after a massive password dump, cybercriminals test the same username and password pair on other popular platforms like Facebook, Spotify, Netflix etc. It was the subject of my research last year; you can read more about credential stuffing below.

Advanced credential stuffing with PEPE
The script parses Pastebin email:password dumps and collects information about each email address. It supports Google, Trumail, Pipl, FullContact and HaveIBeenPwned. Moreover, it allows you to send an informational mail to a person about their leaked password. In the end, every piece of information is stored in Elast…

Remote Desktop Protocol is no different: if someone uses the same password for every service, they must be aware that it affects every one of their Internet identities and accounts, even the most personal ones, like access to a personal computer via exposed RDP. Shodan uses OCR (Optical Character Recognition) to extract text from images, which in some cases includes usernames and the Windows version.

When you search for RDP belonging to a specific company and have no luck in their network, don't forget the free text search method; the machine might be in the cloud and expose the organization name in the username field or a security warning.

There are two tactics used by APTs to gain access to a network. The first is targeting personnel without technical and security knowledge, like HR departments, to execute a hidden malicious payload without being spotted and then elevate permissions.

The second is targeting high profile individuals (CEO, CISO, senior management) to gain almost immediate access to confidential documents, depending on the victim's permissions. Cybercriminals often send phishing emails to senior analysts and engineers in precise sectors – medical, government, education or engineering – but only in targeted attacks and using spear phishing.

HaveIBeenPwned & RDP

To check whether an RDP host is worth hacking based on the exposed username, we need to fire up our OSINT and email skills.

Email address as username

First, let's find some RDP with an email address as the account name. You can use the query "port:3389 has_screenshot:true 'gmail'" and change gmail to any other email provider or company. Unfortunately, for unknown reasons, you can't search for the "@" character, which would return all potential email addresses.

The following code extracts Gmail addresses from exposed RDPs

query = "port:3389 has_screenshot:true gmail"
endpoint = "https://api.shodan.io/shodan/host/search?key="+SHODAN_API_KEY+"&query="+query+"&page="

req = requests.get(endpoint)
req_json = json.loads(req.text)

for match in req_json['matches']:
    text = match['opts']['screenshot']['text'].split("\n")
    for line in text:
        if "@" in line:
            print(line)
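A slightly stricter variant, assuming the same req_json structure, uses a regular expression instead of the bare "@" check and cuts down on OCR noise:

import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

for match in req_json['matches']:
    ocr_text = match['opts']['screenshot']['text']
    for candidate in EMAIL_RE.findall(ocr_text):
        # print the host together with each email-looking string found in the OCR text
        print(match['ip_str'], candidate)

Note that it will skip the truncated addresses mentioned below, where RDP cuts the username off before the top level domain.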

A quick search for one of the email addresses from the returned list (and the screenshot above) reveals it was compromised in 7 different breaches.

That's a lot, and even if the passwords were unique but not randomly generated, one could find the scheme and guess the victim's RDP password.

But is it worth exploiting this machine for any confidential material? We need to investigate who the owner of the email address is. A quick Google search reveals that he is an [REDACTED] engineer and graduated from [REDACTED].

Publicly accessible resume includes the email from the RDP

That's one way you can target the owner or account of an exposed RDP machine. It involves manual searching for a couple of reasons: OCR is not precise and sometimes contains bizarre characters, and the email may include "..." instead of the top level domain because RDP does not show the full username when the session is inactive. In the end, findings have to be manually reviewed in order to determine potentially valuable targets.

Reverse Image search & RDP

This is also an uncommon technique that I haven't heard of anyone using. It's kind of tedious work, but it might pay off and resolve any remaining doubts. If a machine does not reveal any email address and is not linked to any organization, the user's profile picture can help you determine the owner of the machine and its potential value in terms of classified documents or access.

What we need to do is crop the profile image out of the RDP screenshot and run it against one of the reverse image search services; a small helper for saving the screenshots is sketched below.
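A minimal sketch for saving those login screens locally, assuming the same req_json structure as before; Shodan exposes the screenshot as base64-encoded image data under opts.screenshot.data, but treat that field name as an assumption and verify it against your own results:

import base64

for match in req_json['matches']:
    screenshot = match.get('opts', {}).get('screenshot')
    if screenshot and 'data' in screenshot:
        # write the RDP login screen to disk so the profile picture can be cropped manually
        with open(match['ip_str'] + ".jpg", "wb") as f:
            f.write(base64.b64decode(screenshot['data']))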

In this case I had to check a couple of services to get different results. People often reuse their profile picture across different services, especially LinkedIn, the same way they reuse passwords. It's one of the methods used to track users, but this time it helps narrow our search down to specific targets.

Same Twitter profile picture and RDP account above
Example results of reverse image search of face from RDP account

RDP usernames in first and last name format are easier to investigate: you search social media accounts and Internet mentions for that person. However, while a first and last name are not always unique, profile pictures are. In these two cases the pictures were reused across many social media platforms, so it was easy to identify the persons behind these accounts as a Senior Product Owner and the CEO of a financial company.

Conclusion

OSINT is widely used not only by law enforcement and researchers but also by cybercriminals. By cybercriminals I mean actors who target specific people and companies to gain as much profit as possible: confidential documents, client data, intellectual property or just money. Knowing their techniques, you can defend yourself and, at the very least, not be an easy target.

For researchers, monitoring exposed Bluekeep machines gives a view of the next potentially compromised companies and lets you predict newspaper headlines.

Please subscribe for early access, new awesome things and more.