Switching Your GTK Theme Based on Time of Day
Using systemd timers to automatically switch between light and dark GTK themes on a GNOME desktop. This may not be a data science post, but systemd timers are handy whenever we want to run scheduled jobs on a Linux workstation.
Web-scraping AbeBooks.com (Reverse Engineering a REST API)
We often need to obtain data from websites that do not have a documented REST API. In this post, I analyze the POST and GET requests in AbeBooks's network traffic and build a Python API wrapper for programmatically obtaining book prices and recommendations.
Interweaving R and Python with Reticulate
I love Python for general programming and data manipulation... but R has amazing statistical libraries. Do you also wish you could combine both? Here is a small demonstration of how to access Python objects from R using the Reticulate library.
Machine Learning Methods for LogP Prediction: Pt. 1
The octanol-water partition coefficient, or logP, is one of the most important properties for determining a compound’s suitability as a drug. Unfortunately, existing models are not as accurate as we would like. This experiment compares the performance of several different molecular fingerprinting methods, paired with three common machi...
Mining Pharos with MySQL and Python
Accessing SQL databases with Python can be useful in many situations. Here, we use MySQL Connector and Python’s Pandas library to retrieve and manipulate data for Pharos targets. The goal is to obtain a dataset of targets that contain more than 15 active compounds, along with information about their different target classes.
Mapping Pharos Targets to PDB Structures
The SIFTS (Structure Integration with Function, Taxonomy and Sequence) database provides mappings between UniProt genes and PDB structures, among other things. Using these mappings, we count the number of targets from the NIH's Pharos database that have both ligands with known binding affinities and at least one structure in the Protein Data Bank.