• Started designing how the TextMessageFilter feature will be implemented. Since it is a stream-stream gRPC call, it will operate very similarly to the authenticators.

Tomorrow: implement the feature described above.
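
The stream-stream shape described above can be sketched without any real gRPC plumbing: the server streams text-message events to the filter, and the filter streams decisions back, much as the authenticator stream does. This is only a conceptual sketch — the event fields, decision format, and filtering rule are all illustrative assumptions, not the actual Murmur API.

```python
# Conceptual sketch of a stream-stream filter loop: one decision is
# yielded per incoming event, mirroring a bidirectional gRPC stream.
# Field names ("id", "text", "action") are hypothetical.
def text_message_filter(events):
    """Yield one accept/drop decision per incoming text-message event."""
    for event in events:
        if "spam" in event["text"].lower():  # placeholder filtering rule
            yield {"id": event["id"], "action": "drop"}
        else:
            yield {"id": event["id"], "action": "accept"}

decisions = list(text_message_filter([
    {"id": 1, "text": "hello"},
    {"id": 2, "text": "buy SPAM now"},
]))
print(decisions)
```

In a real gRPC servicer the `events` iterator would be the request stream and each yielded dict would be a response message.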

Deconstructing the Dropbox URL

I tried grabbing the Dropbox download URL with Python's urlparse and opening the file locally with urlretrieve, but I was getting a big, useless blob of data because Dropbox embeds the file in a page from which you can download it. But you can apparently hack the download link by changing the value in the query part of the URL: in an example link, changing the zero to a one turns it into a direct download link that I can open and read in Python. I wrote a function that parses the URL and rebuilds a new URL with the changed value. I'll be testing the new URL to open and retrieve a file locally, and I'll also look into whether a similar method would work for a Google Drive link.

I did a check-in with the main author, who helped me think through the issue. I really appreciate the immediate feedback of a mentorship relationship when facing an unforeseen problem. The assurance that my approach wasn't wrong, that I wasn't crazy, and the ability to think through a way out of the problem with someone are critical for building skills and confidence.
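
A minimal sketch of that URL-rewriting function, assuming the query parameter in question is Dropbox's `dl` flag (`0` renders the preview page, `1` forces a direct download); the example URL below is hypothetical:

```python
# Rebuild a Dropbox share URL so it points at the raw file rather than
# the HTML preview page, by flipping the dl query parameter to 1.
from urllib.parse import urlparse, parse_qs, urlencode, urlunparse

def make_direct_download(url):
    """Return the URL with its dl query parameter set to 1."""
    parts = urlparse(url)
    query = parse_qs(parts.query)
    query["dl"] = ["1"]  # 0 = preview page, 1 = direct download
    return urlunparse(parts._replace(query=urlencode(query, doseq=True)))

print(make_direct_download("https://www.dropbox.com/s/abc123/data.txt?dl=0"))
# https://www.dropbox.com/s/abc123/data.txt?dl=1
```

The rebuilt URL can then be handed to urlretrieve (or requests) to fetch the file contents directly.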

Noseyparker - 5 August

I collected a couple of application log samples from the server and processed them using the updated report code, fixing a few bugs along the way. The reporting code seems reasonably robust; however, more testing is still required.

While collecting the application logs I checked the performance of the MitM server. With just the PII handlers running, CPU usage peaked below 15%, which is pleasing given the large amount of processing occurring and the modest server specifications. Even with a low handler frequency (running handlers on every 1 in 5 requests) there are still occasional timeouts, most likely caused by the latency of my cloud-based setup.
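
The "1 in 5 requests" throttle could be as simple as a counter on the handler; this is a hypothetical sketch of that sampling idea, not the actual MitM server code:

```python
# Run the expensive PII handlers on only a fraction of requests to keep
# CPU usage (and added latency) down. The class name and structure are
# illustrative assumptions.
import itertools

class SampledPIIHandler:
    def __init__(self, every=5):
        self.every = every
        self.counter = itertools.count()

    def should_run(self):
        # True for request 0, 5, 10, ... i.e. 1 in `every` requests.
        return next(self.counter) % self.every == 0

handler = SampledPIIHandler(every=5)
ran = [handler.should_run() for _ in range(10)]
print(ran)
```

With `every=5`, only the first of each block of five requests triggers the handlers; the rest pass through untouched.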

5 August

Have been running tests with VTune this week to try to understand how my code is interacting with my CPU and my cache. I have a meeting with David set up tomorrow to discuss the results and talk about where I can head from here.


Designing a filter

At this point, I'm just trying to figure out how to design one of these filters practically. I've started looking at tutorials on particle filter design to get a better grasp on that side of things. For the short time I've been studying them, I believe I have a pretty decent grasp of the theory behind a particle filter; bridging the gap to actually designing my own is an interesting challenge.
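
To make the theory concrete, here is a minimal bootstrap particle filter for a 1-D random-walk state with Gaussian observation noise — a textbook toy, with all models chosen for illustration rather than taken from my actual problem:

```python
# Bootstrap particle filter: propagate, weight by likelihood, resample.
import math
import random

random.seed(0)

def step(particles, observation, obs_std=1.0, proc_std=0.5):
    # 1. Propagate each particle through the (random-walk) motion model.
    particles = [p + random.gauss(0.0, proc_std) for p in particles]
    # 2. Weight particles by the Gaussian observation likelihood.
    weights = [math.exp(-0.5 * ((observation - p) / obs_std) ** 2)
               for p in particles]
    total = sum(weights)
    weights = [w / total for w in weights]
    # 3. Resample proportionally to weight (multinomial resampling).
    return random.choices(particles, weights=weights, k=len(particles))

particles = [random.uniform(-5, 5) for _ in range(500)]
for z in [0.0, 0.2, 0.4, 0.6]:  # observations of a slowly drifting state
    particles = step(particles, z)
estimate = sum(particles) / len(particles)
print(round(estimate, 2))  # posterior mean, near the recent observations
```

The three steps (propagate, weight, resample) are the whole algorithm; everything interesting in a real design is choosing the motion and observation models and fighting particle degeneracy.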


  • Added a bit of safety to context actions. Before this, any user could trigger any context action, and it would be sent to the ContextActionEvents stream. Now, users can only trigger context actions that have been explicitly added for them. This is similar to how the Ice system works, so I am glad to have this implemented.
  • Added support for TLS secured connections (for murmur as well as murmur-cli).
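
The per-user safety check in the first bullet amounts to an allow-list keyed by user; this hypothetical sketch shows the shape of such a check (names and structure are mine, not Murmur's):

```python
# Only context actions explicitly added for a user may be triggered by
# that user; everything else is rejected before reaching the event stream.
class ContextActionRegistry:
    def __init__(self):
        self._allowed = {}  # user_id -> set of permitted action names

    def add(self, user_id, action):
        self._allowed.setdefault(user_id, set()).add(action)

    def may_trigger(self, user_id, action):
        return action in self._allowed.get(user_id, set())

reg = ContextActionRegistry()
reg.add(42, "mute_target")
print(reg.may_trigger(42, "mute_target"))  # True
print(reg.may_trigger(7, "mute_target"))   # False
```

Only triggers that pass `may_trigger` would be forwarded to the ContextActionEvents stream.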

Tomorrow: implement more wishlist features.

requests module

I was exploring the Python requests module. To validate that WebLogo only downloads files from the internet, I'll use urllib2's URL parser function: only URLs whose scheme is http, https, or ftp will be accepted, locking WebLogo out of accessing any local files. I'm getting valid data from the Dropbox link, but when I try to run the link through the requests.get function I get a huge blob of data that doesn't look like the text file I want to open. Dropbox takes the URL for the file, presents it in an HTML & JavaScript page, and puts the content in an iframe. I was trying to sort through the page to grab the data from the iframe. I'll try running the parser and isolating the file path, so I can grab just the file instead of the entire Dropbox presentation page. However, I do have the example page's lightbox module working properly again.
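
The scheme check can be sketched in a few lines with the standard-library URL parser (shown here via `urllib.parse`, the Python 3 home of urllib2's urlparse); the function name and example URLs are illustrative:

```python
# Accept only remote-fetch schemes, so a crafted URL cannot make
# WebLogo read local files (e.g. via file://).
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https", "ftp"}

def is_remote_url(url):
    return urlparse(url).scheme.lower() in ALLOWED_SCHEMES

print(is_remote_url("https://www.dropbox.com/s/abc123/seqs.fa"))  # True
print(is_remote_url("file:///etc/passwd"))                        # False
```

Anything failing the check would be rejected before any download is attempted.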


It's a holiday here in Canada, so what I was planning for today has been moved to Tuesday.

Noseyparker - 4 August

I finished updating the PII data JSON report to handle recent application log changes. The report is looking good now and, for each online domain, shows:

  • PII disclosed over unencrypted (non-HTTPS) connections
  • PII disclosed over encrypted (HTTPS) connections
  • Query-string key/value pairs occurring across multiple requests over unencrypted (non-HTTPS) connections

Note: although some pairs are anonymous, their persistence across multiple requests could allow user tracking.

Here is a sample of the PII data report -

I ironed out a few bugs in the report. Tomorrow I will process more application log samples and fix any issues I find.
