GSOC15 adventuring

more testing

I rewrote the function and finally got the unit test for the open-file-from-URL function working. I was really pleased with getting everything to work. It got me thinking that I should run tests on the HTML and JavaScript as well. I've been researching web browser tests and might try to install the wptrunner test suite. I also made adjustments to the API documentation. My current priority is to get the JavaScript/jQuery to pop up an input for the URL. The main author had initially wanted to use the textbox for the URL input, but he decided that a separate input would be better than overloading the textbox.

debugging unit test

I've been debugging my unit test, and it was difficult to figure out whether the unit test was broken or the code was broken. I got feedback on my code review, and I need to change all my Python "tabs" to spaces, which I think I can do globally. I also had to fix the logic of the test. Once the test itself was written correctly it still failed, and that failure showed a flaw in the main function. I'll need to rewrite the main function to parse URLs with a different order of operations.
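One way the global tabs-to-spaces change could be scripted is sketched below. The function name `tabs_to_spaces` and the width of 4 are my own assumptions for illustration, and it only touches tabs at the very start of a line, so mixed space/tab indentation would still need a manual look.

```python
def tabs_to_spaces(source, width=4):
    """Replace each leading tab with `width` spaces, leaving other text alone.

    Hypothetical helper: only tabs at the very start of a line are converted.
    """
    fixed = []
    for line in source.splitlines(keepends=True):
        body = line.lstrip("\t")          # drop the leading tabs
        n_tabs = len(line) - len(body)    # count how many were dropped
        fixed.append(" " * (width * n_tabs) + body)
    return "".join(fixed)
```

Most editors can do the same thing with a "convert indentation" command, which is probably less work than a script.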

troubleshooting aws

The peer code review will be ongoing, but I did a consultation with another mentor about making WebLogo "Fail Gracefully". To make WebLogo more stable, I'll have to make changes to the build procedure to add a requirements.txt and serve the dependencies from the server. I was told it wouldn't really add that much weight to the application. I also spent time figuring out the WSGI path error on AWS. It looks like I'll have to add some directories that are specific to AWS's Elastic Beanstalk service to make it work. Even then, I might still need to consult an AWS engineer because WebLogo's use case is very particular.

Unit Test draft

I prepared the function I wrote for a peer code review presentation I will be doing tomorrow. I added documentation to the function elements, and tidied up the function so it looks like "clean" Python code. I also wrote a draft of a unit test for the function. I need clarification on "writing a test that fails" from the main author. I'm a little turned around on how a successful test is one that fails the function. I think I followed the pattern correctly from the Python docs, but choosing the right "assert" function was confusing. Why are coding docs so hard to read? So many words and so little communication of information.
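For what it's worth, here is a minimal sketch of how a test can pass because the function fails, using `assertRaises` from Python's `unittest` module. The `open_url` below is a hypothetical stand-in written only for this example, not WebLogo's actual function.

```python
import unittest


def open_url(url):
    """Hypothetical stand-in for the function under test."""
    if not url.startswith(("http://", "https://", "ftp://")):
        raise ValueError("not a remote URL: %r" % url)
    return url


class OpenUrlTest(unittest.TestCase):
    def test_rejects_local_path(self):
        # This test succeeds precisely because the function raises.
        with self.assertRaises(ValueError):
            open_url("/etc/passwd")

    def test_accepts_http(self):
        url = "http://example.com/a.txt"
        self.assertEqual(open_url(url), url)
```

Saved in a test module, this can be run with `python -m unittest`. If `open_url` ever stopped raising on a local path, `test_rejects_local_path` would be the test that fails.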

aws trial and unit testing

I rewrote the open URL function as try/except conditions instead of if/else. I researched and used Python's standard exceptions for the error messaging. I also scrubbed the entire WebLogo directory for references to code.google.com, which will become read-only on August 25th. I'm glad the GitHub export was completed by mid-term. However, while scrubbing the code, I ran into the developer's guide, which outlines how to build new releases with Subversion. The current procedure outlined in the guide would not work with GitHub's fork/pull workflow or its releases section, so it will need to be rewritten. AWS has a 12-month trial service, so I tried to deploy WebLogo on AWS servers as a test environment. I tried to follow the Amazon Web Services guides and failed. The error I'm getting is "Your WSGIPath refers to a file that does not exist." I think this may be due to WebLogo being a .cgi application. I tried adjusting the environment and rebuilding the server, but I'm getting a 404 error. I'm also researching how to do unit testing and writing a test procedure for the open URL function for WebLogo's test suite.
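The try/except rewrite mentioned at the start of this entry could look roughly like the sketch below. It is written for Python 3's `urllib.request`, which is an assumption on my part, and the name `read_url` and the `opener` parameter are hypothetical, added so the function can be exercised without touching the network.

```python
from urllib.error import URLError
from urllib.request import urlopen


def read_url(url, opener=urlopen):
    """Return the bytes behind url, or raise IOError with a readable message.

    Hypothetical sketch: `opener` defaults to urlopen and can be swapped out.
    """
    try:
        response = opener(url)
        try:
            return response.read()
        finally:
            response.close()
    except ValueError as err:
        # urlopen raises ValueError for strings it cannot treat as a URL.
        raise IOError("not a valid URL: %s (%s)" % (url, err))
    except URLError as err:
        # Covers unreachable hosts and HTTP error responses.
        raise IOError("could not reach %s: %s" % (url, err.reason))
```

Compared with a chain of if/else checks, this just attempts the operation and translates whatever goes wrong into one kind of error the caller has to handle.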

aws & file checker

I have the open-file-from-URL function working under test conditions. The function accesses the URL, validates that it points to a file on the internet, and raises an exception if it doesn't. It downloads the file locally, and WebLogo can open it from there. I added a file checking procedure to the open URL function. I researched the file extensions of all the formats that WebLogo accepts, such as PIR, NEXUS, PHYLIP etc., in addition to .txt files. The procedure parses the path of the locally saved file to isolate the file extension, compares it to the acceptable file types, and raises an exception if the file type isn't valid. I'm reviewing the AWS tutorials on how to launch web apps, and will change the Dropbox chooser to a download URL checkbox, so the sequence box will accept a URL string instead of sequence data.
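The file checking step described above can be sketched as follows. The set of extensions is illustrative only (the full list WebLogo accepts isn't reproduced here), and `check_extension` is a hypothetical name.

```python
import os

# Illustrative only: not the real list of extensions WebLogo accepts.
EXAMPLE_EXTENSIONS = {".txt", ".pir", ".nex", ".phy"}


def check_extension(path, allowed=EXAMPLE_EXTENSIONS):
    """Return the lower-cased extension of path, or raise ValueError."""
    ext = os.path.splitext(path)[1].lower()   # "seqs.PIR" gives ".pir"
    if ext not in allowed:
        raise ValueError("unsupported file type: %r" % ext)
    return ext
```

Lower-casing the extension before comparing means `seqs.PIR` and `seqs.pir` are treated the same, and a file with no extension at all is rejected.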

deconstructing db url

I tried grabbing the Dropbox download URL with Python's urlparse and opening the file locally with urlretrieve, but I was getting a big useless blob of data because Dropbox embeds the file in a page from which you can download the file. But you can apparently hack the download link by changing the value in the query part of the URL. An example link looks like: https://www.drpbx.com/x/xxxxxxxxxxxxxxx/LICENSE.txt?dl=0, and if you change the zero to a one then it becomes a direct download link that I can open and read in Python. I wrote a function that parses the URL and rebuilds a new URL with the changed value. I'll be testing the new URL to open and retrieve a file locally. Also, I'll look into whether a similar method would work for a Google Drive link. I did a check-in with the main author, who helped me think through the issue. I really appreciate the immediate feedback of a mentorship relationship when facing an unforeseen problem. The assurance that my approach wasn't wrong, that I wasn't crazy, and the ability to think through a way out of the problem with someone is critical for building skills and confidence.
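A minimal sketch of that parse-and-rebuild step, written for Python 3 where the parser lives in `urllib.parse` (an assumption; the function name `to_direct_download` is also hypothetical):

```python
from urllib.parse import parse_qs, urlencode, urlparse, urlunparse


def to_direct_download(url):
    """Return the same URL with the query parameter dl set to 1."""
    parts = urlparse(url)
    query = parse_qs(parts.query, keep_blank_values=True)
    query["dl"] = ["1"]                       # dl=0 becomes dl=1
    new_query = urlencode(query, doseq=True)  # back to "key=value" form
    return urlunparse(parts._replace(query=new_query))
```

Parsing the query instead of doing a plain string replace means other parameters in the link are preserved, and a `dl=0` that happens to appear in the path is left alone.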

requests module

I was exploring the Python requests module. To validate that WebLogo only downloads files from the internet, I'll use urllib2's url parser function. If the parser detects http, https, or ftp in its scheme, then we'll lock WebLogo out from accessing any local files. I'm getting valid data from the Dropbox link, but when I try to run the link through the requests.get function I get a huge blob of data that doesn't look like the text file that I want to open. Dropbox takes the URL for the file and presents it in an HTML & JavaScript page with the content in an iframe. I was trying to sort through the page to grab the data from the iframe. I'll try running the parser to isolate the file path, so I can grab just the file instead of the entire Dropbox presentation page. On the bright side, I do have the example page's lightbox module working properly again.
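The scheme check described in this entry might be sketched like this. It uses Python 3's `urllib.parse` (an assumption), and `is_remote_url` is a hypothetical name; the extra check that the URL names a host is my own addition.

```python
from urllib.parse import urlparse

# Schemes treated as "from the internet", taken from the entry above.
ALLOWED_SCHEMES = {"http", "https", "ftp"}


def is_remote_url(url):
    """Return True only if the URL uses an allowed scheme and names a host."""
    parts = urlparse(url)
    return parts.scheme in ALLOWED_SCHEMES and bool(parts.netloc)
```

With this, `is_remote_url("file:///etc/passwd")` and a bare local path both come back False, so anything that isn't http, https, or ftp never reaches the code that opens files.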

passing urls

8/3 The main author nixed the proposal for an email logo option. He thought that setting up an email server would be a pain, though I think it's more painless now. However, he said that an email option leaves the server vulnerable to being used as a spam machine, since there's no way to validate that an address is genuine. He mentioned that an online download option could be handled with Python's url parser instead of Dropbox's proprietary sign-in. So that option will be investigated and tested.

8/2 I started researching how to send the logo via email from the Python script using the smtplib and email modules. I wrote a draft of a function to send the logo via email, based on how WebLogo handles the download. I sent the "pseudo-code" to the main author to get his thoughts on the proposal. I'll keep playing with and testing this prospect.
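For the record, the message-building half of that draft could be sketched with the standard `email` package as below. Everything here is hypothetical (the function name, the subject line, the PNG content type); it only constructs the message and does not send anything.

```python
from email.message import EmailMessage


def build_logo_email(sender, recipient, logo_bytes, filename="logo.png"):
    """Build (but do not send) an email with the logo attached.

    Hypothetical sketch: assumes the logo is PNG data already in memory.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Your WebLogo"
    msg.set_content("The sequence logo you requested is attached.")
    msg.add_attachment(logo_bytes, maintype="image", subtype="png",
                       filename=filename)
    return msg
```

Sending would then be a call to `send_message` on an `smtplib.SMTP` connection, which is the part that needs a mail server.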

8/1 The dropbox implementation is proving to be more complicated than anticipated. I had wanted to keep the function separate from the core python, but that looks like it's not possible. Also, Professor Massey suggested that I look into other open source alternatives to dropbox, so I'll be researching those as well.

Download to Dropbox

I started tackling the issue of saving to Dropbox. Saving a weblogo can't solely be done through the front end. There is a checkbox for downloading a weblogo on the web application tool, but the download is handled in the python code by reading the HTTP response. I had a meeting with the main author regarding this issue, and he directed me to the file that listens for the HTTP response and presents the download option. I'll be studying this file to figure out how to get the python to work with dropbox saver's javascript functions.
