My GSoC Adventure

Noseyparker - 8 June

Spent part of today testing the code I have written for detecting PII in mobile app HTTP requests, in query strings and (non-HTTPS) request headers. The test cases I could run passed, although I had to add better error handling for exceptions. I couldn't find apps to test all cases, but I simulated some by browsing sites in the Android Chrome browser. I'll need to think about which test cases I should add to the existing nogotofail Android testing harness (app).
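
For anyone curious what this kind of check looks like, here is a minimal sketch in Python 3. It is not the actual nogotofail handler code: the function names, the sample PII values and the example URL are all made up for illustration, and it only does a simple case-insensitive substring match.

```python
from urllib.parse import urlsplit, parse_qsl


def find_pii_in_query(url, pii_items):
    """Return {pii_name: [query parameter names]} for PII found in the query string."""
    # parse_qsl also percent-decodes the values (e.g. %40 becomes @)
    params = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    hits = {}
    for pii_name, pii_value in pii_items.items():
        matched = [key for key, value in params
                   if pii_value.lower() in value.lower()]
        if matched:
            hits[pii_name] = matched
    return hits


def find_pii_in_headers(headers, pii_items):
    """Return {pii_name: [header names]} for PII found in header values."""
    hits = {}
    for pii_name, pii_value in pii_items.items():
        matched = [name for name, value in headers.items()
                   if pii_value.lower() in value.lower()]
        if matched:
            hits[pii_name] = matched
    return hits


# Hypothetical sample data, not taken from a real app
pii_items = {"android_id": "9774d56d682e549c", "email": "user@example.com"}
url = "http://ads.example.com/track?id=9774d56d682e549c&e=user%40example.com&v=2"
headers = {"User-Agent": "Demo/1.0", "X-Device": "9774D56D682E549C"}

query_hits = find_pii_in_query(url, pii_items)
header_hits = find_pii_in_headers(headers, pii_items)
```

With this sample data, `query_hits` reports the `id` and `e` parameters and `header_hits` reports the `X-Device` header.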

The next stage of the project is detecting PII in HTTP request bodies. The author told me the current app doesn't handle compressed request bodies (e.g. using gzip). I spent the afternoon investigating the HTTP libraries currently in use and how to detect compressed content. I also started to look at the different formats request bodies could use, e.g. JSON and XML.
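
A rough idea of how compressed bodies could be detected, written as a Python 3 sketch using only the standard library. The `decode_body` helper is hypothetical and is not how nogotofail does it; it assumes the headers are available as a plain dict and the body as bytes.

```python
import gzip
import json
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def decode_body(headers, body):
    """Return the request body as text, gunzipping it first if it looks compressed."""
    encoding = ""
    for name, value in headers.items():
        if name.lower() == "content-encoding":
            encoding = value.strip().lower()
    # Trust the header, but also check the magic bytes in case the header is missing
    if encoding == "gzip" or body[:2] == GZIP_MAGIC:
        try:
            body = gzip.decompress(body)
        except (OSError, EOFError, zlib.error):
            pass  # not valid gzip after all, fall back to the raw bytes
    return body.decode("utf-8", errors="replace")


# Hypothetical JSON body, sent once compressed and once as-is
raw = json.dumps({"email": "user@example.com"}).encode("utf-8")
compressed = gzip.compress(raw)

text_from_gzip = decode_body({"Content-Encoding": "gzip"}, compressed)
text_from_plain = decode_body({"Content-Type": "application/json"}, raw)
```

Once the body is plain text again it can be searched directly, or parsed first (e.g. with `json.loads`) when the content type says it is JSON.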

Noseyparker - 5 June

I have been trying to find some Android mobile apps I can use to test different PII leakage scenarios. Some of the apps I looked at a few months ago have stopped leaking information (which is great!), but that doesn't help with my testing.

I used an SSL proxy I set up (mitmproxy) to inspect traffic for a variety of apps, and found some I can use to test a number of cases. Along the way, I found that some apps I have been using haven't triggered the expected event handlers in my code - I will look closer at why these didn't trigger.

Spent some more time tidying up some earlier code as well.

Noseyparker - 4 June

I decided to use the name of my project in the blog title ("Noseyparker") to make it more descriptive. Today was a productive day of coding - I was able to combine PII read from the Android client with PII specified by the user into a single collection. It was tricky finding the best place in the project to do this.

Along the way, I found that the place I had been using to create the PII collection was called many times (once for each HTTP connection setup), potentially causing a large impact on processor and memory resources. I moved this code to an object that should only be called once for each Android client setup.

I compared the Google Compute Engine server stats after the change with earlier stats - and found peak processor usage dropped significantly, from 14% to 10%. Unfortunately there are no memory stats available.
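
The "build it once per client" idea from this entry can be sketched like this in Python 3. The `ClientPiiStore` class and `on_connection` function are hypothetical names for illustration only, not the real nogotofail objects.

```python
class ClientPiiStore:
    """Hypothetical holder that merges device and user PII once per client."""

    def __init__(self, device_pii, user_pii):
        merged = dict(device_pii)
        merged.update(user_pii)  # user-specified entries win on duplicate keys
        # Drop empty values so they are never searched for
        self.items = {name: value for name, value in merged.items() if value}


def on_connection(store, request_text):
    """Per-connection check that reuses the prebuilt collection."""
    return [name for name, value in store.items.items() if value in request_text]


# Built once when the client connects (sample values are made up)
store = ClientPiiStore(
    {"android_id": "9774d56d682e549c", "mac": ""},
    {"email": "user@example.com"},
)

# Called for every connection, without rebuilding the collection
found = on_connection(store, "GET /track?id=9774d56d682e549c HTTP/1.1")
```

The merge work happens in the constructor only, so each new connection just pays for the search.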

3 June GSoC

I'm finding working on this open-source project interesting, but different to other projects I've worked on in previous jobs. In other projects, I have worked side by side with the authors in the workplace, or been involved with the design of the application from the start.

Working on another person's project without having them beside me, I find I learn more about the application design every day. I've noticed I'm often refactoring my code to make it fit within "my current understanding" of the design.

I spent today refactoring my code. It is now looking more streamlined. But unfortunately, I don't feel I have made a lot of progress today ... Well, 2 steps forward, one step back I guess :)

2 June GSoC

Today was productive. I added logic to the PII detection event handlers on the server side to trigger when device location (longitude & latitude) is disclosed in unencrypted traffic. I also spent time tidying and simplifying code.

I consider myself to have a basic level of Python language knowledge and have run into a couple of issues I wasn't aware of, e.g. formatting a floating point number to 2 decimal places as a string rounds the value rather than truncating it! I did some reading on Python string and mathematical operations.
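
Here is a small Python 3 illustration of the rounding surprise. The helper names and the coordinate values are made up; the truncating version uses the standard `decimal` module with `ROUND_DOWN`, which is one of several ways to do it.

```python
from decimal import Decimal, ROUND_DOWN


def format_rounded(value):
    """String formatting to 2 decimal places rounds the value."""
    return "{:.2f}".format(value)


def format_truncated(value):
    """Cut off everything after 2 decimal places (rounds toward zero)."""
    return str(Decimal(str(value)).quantize(Decimal("0.01"), rounding=ROUND_DOWN))


longitude = 174.7762  # hypothetical coordinates
latitude = -36.8485

rounded = (format_rounded(longitude), format_rounded(latitude))
truncated = (format_truncated(longitude), format_truncated(latitude))
# rounded   is ('174.78', '-36.85')
# truncated is ('174.77', '-36.84')
```

This matters when searching traffic for a location, since an app may send either the rounded or the truncated form of the same coordinate.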

1 June GSoC

I made changes to the Android client to read more device information such as the WiFi MAC address, Device ID and current location.

Retrieving the location was tricky - sometimes the location was returned, other times it wasn't. Also, initially I didn't have enough error handling to stop the app from crashing when the current location wasn't available. With the help of Stack Overflow (again!) I worked my way through it... mostly ... enabling location history and going to Google Maps to obtain my location (after device reboot) was my short-term fix.

I need to further investigate the most reliable method for retrieving a device's location to make sure the client is reliable.

29 May GSoC

Today I worked on the 1st prototype. I took some code I previously developed for my Master's project, added it to nogotofail, and modified it to create a number of basic functions:

  • Changes to the Android client to read device IDs (Android ID, Google Ad ID)

  • Server event handlers for PII in plain-text (unencrypted) traffic for request query strings and http page headers.

  • New notification messages on server and client for PII event handlers.

In the tests I ran, this code seemed to mostly work as expected. However, one of the events didn't trip every time I expected it to. The code could also be optimized a lot.

Need to do more testing to shake the bugs out.

28 May GSoC

Unfortunately time got away from me studying for an exam and I didn't get a chance to blog.

Over the past two days I have:

  • Looked for ways to reduce the size of the PII sets searched in HTTP requests. Using mitmproxy I examined a sample of mobile apps to look for HTTP header items which don't contain PII. I was able to find a lot which aren't intended to contain PII, e.g. content-length, accept-language. Discarding these will make searching much more efficient.

  • A decision I need to make is whether I should search one long query string for PII, or break the query string down into a list of elements and search those. It's still not clear which is the best option. More research and (possibly) code profiling is needed.
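
Both ideas above can be sketched in a few lines of Python 3. Everything here is illustrative: the function names are hypothetical, the skip list only contains the two headers mentioned above, and the sample query string is made up.

```python
from urllib.parse import parse_qsl

# Headers assumed never to carry PII; only the two examples from this post
NON_PII_HEADERS = frozenset({"content-length", "accept-language"})


def candidate_headers(headers):
    """Drop headers that are not worth searching."""
    return {name: value for name, value in headers.items()
            if name.lower() not in NON_PII_HEADERS}


def search_whole(query, pii_values):
    """Option 1: search the raw query string as one long piece of text."""
    return [pii for pii in pii_values if pii in query]


def search_split(query, pii_values):
    """Option 2: split into (name, value) pairs and search each decoded value."""
    params = parse_qsl(query)
    return [pii for pii in pii_values
            if any(pii in value for _, value in params)]


headers = {"Content-Length": "42", "Accept-Language": "en", "Cookie": "uid=abc123"}
query = "id=9774d56d682e549c&e=user%40example.com"
pii_values = ["9774d56d682e549c", "user@example.com"]

kept = candidate_headers(headers)
whole_hits = search_whole(query, pii_values)
split_hits = search_split(query, pii_values)
```

One difference shows up straight away with this sample: searching the raw string misses the email address because it is percent-encoded, while the split version decodes each value first. Which option is faster is a separate question that something like the standard `timeit` module could help answer.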

26 May GSoC

My project will be using the Python dictionary type a lot - to store collections of key/value pairs, as well as collections of collections (key/value pairs).

Common operations on dictionaries are likely to be set intersection & union and element sorting. The number of elements in a dictionary could be reasonable (up to 10 elements), so efficiency is important.

I searched blogs and Stack Overflow, and a lot of people suggested the use of dictionary comprehensions.

I spent time experimenting with comprehensions in IPython (an interactive Python shell) to better understand them.
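
The kind of thing I was trying out looks roughly like this. It is a Python 3 sketch with made-up sample data; on Python 2.7 the key views come from `viewkeys()` rather than `keys()`.

```python
# Two small hypothetical PII dictionaries
device_pii = {"android_id": "9774d56d682e549c", "mac": "00:11:22:33:44:55"}
user_pii = {"email": "user@example.com", "mac": "00:11:22:33:44:55"}

# Key views behave like sets, so intersection and union work directly
shared_keys = device_pii.keys() & user_pii.keys()
all_keys = device_pii.keys() | user_pii.keys()

# Dictionary comprehensions build new dictionaries from those key sets
common = {key: device_pii[key] for key in shared_keys}
merged = {key: user_pii.get(key, device_pii.get(key)) for key in all_keys}

# Sorting the items gives a list of (key, value) pairs ordered by key
sorted_keys = [key for key, _ in sorted(merged.items())]

# A collection of collections, transformed with a nested comprehension
nested = {"device": device_pii, "user": user_pii}
value_lengths = {group: {key: len(value) for key, value in items.items()}
                 for group, items in nested.items()}
```

Here `common` ends up holding only the `mac` entry, and `sorted_keys` lists `android_id`, `email` and `mac` in that order.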

25 May GSoC

I have used mitmproxy as an SSL proxy before to inspect mobile application traffic - it would be great for my project to validate results and explore. I have it installed on my Raspberry Pi configured as a wireless access point. Unfortunately Raspbian support for WiFi adapter monitor mode isn't great, and almost every time the kernel is upgraded I need to re-compile the adapter (cross-compile) ... painful :( With the help of a member of the mitmproxy forum I was able to set it up on a Google Compute Engine Debian instance and it works great!

Spent some time reading about Python list comprehensions, which seem to be more efficient and readable than normal for loops when working with lists.
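
A quick side-by-side of the two styles, as a Python 3 sketch with made-up header data:

```python
headers = [("Host", "example.com"), ("Content-Length", "42"), ("Cookie", "uid=abc123")]

# Normal for loop: collect lower-cased header names, skipping content-length
names_loop = []
for name, value in headers:
    if name.lower() != "content-length":
        names_loop.append(name.lower())

# The same thing as a list comprehension
names_comp = [name.lower() for name, value in headers
              if name.lower() != "content-length"]
```

Both produce the same list (`host` and `cookie`); the comprehension just says it in one expression.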
