Drop Pandas DataFrame columns only if they exist

TL;DR: Sometimes you need to drop columns in a Pandas DataFrame, but you may not be sure they actually exist. Here are a few snippets to do it without raising errors.

Safely Dropping Columns That Might Not Exist

Sometimes, your dataset might include placeholder or optional columns such as “Unknown”, “Other”, or “None”. You may…
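As a minimal sketch of the idea, assuming a small made-up DataFrame and a hypothetical list of optional column names (neither comes from the post itself), two common ways to drop columns that may be missing look like this:

```python
import pandas as pd

# Made-up example data: only "Unknown" is actually present.
df = pd.DataFrame({"A": [1, 2], "Unknown": [0, 0], "B": [3, 4]})

# Hypothetical list of optional columns; "Other" and "None" are absent here.
optional = ["Unknown", "Other", "None"]

# Option 1: errors="ignore" makes drop() skip labels that do not exist.
cleaned = df.drop(columns=optional, errors="ignore")

# Option 2: filter the list first, then drop only the labels that exist.
present = [col for col in optional if col in df.columns]
cleaned_explicit = df.drop(columns=present)
```

Both variants return a new DataFrame and leave `df` unchanged. The second one is more verbose, but it makes it easy to log which columns were actually removed.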

Calculate metrics on numeric columns only for Pandas DataFrames

TL;DR: Sometimes you need to calculate averages, minimums, or other metrics from numeric-only columns in a Pandas DataFrame. Here are a few snippets to do it.

Select numeric columns and calculate metrics

You often want to calculate summary statistics like the mean, median, or standard deviation, but you only care about numeric…
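A minimal sketch of this, assuming a made-up DataFrame with one text column and two numeric ones (the column names are illustrative only), could select the numeric columns with `select_dtypes` and then aggregate them:

```python
import pandas as pd

# Made-up example data: "name" is text, "price" and "qty" are numeric.
df = pd.DataFrame(
    {
        "name": ["a", "b", "c"],
        "price": [1.0, 2.0, 3.0],
        "qty": [10, 20, 30],
    }
)

# Keep only the numeric columns.
numeric = df.select_dtypes(include="number")

# Compute several metrics at once; rows are metrics, columns are fields.
summary = numeric.agg(["mean", "median", "std"])

# Alternative for a single metric: skip non-numeric columns directly.
means = df.mean(numeric_only=True)
```

Selecting the numeric columns first keeps the later calls simple, because every aggregation then runs on columns where it is defined.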

Fix Flatpak browser not opening links in Ubuntu

TL;DR: Sometimes Ubuntu stages a coup, setting the Snap internet browser as default. This may break your setup if you actually installed your browser using Flatpak. Here are a few steps to fix this issue.

Flatpak vs. Snap

Ubuntu has a strong preference for Snaps… but maybe you don’t necessarily agree with this and you prefer…

Use Python Selenium with Snap browsers

TL;DR: Selenium is one of the main libraries for browser automation and web scraping. Sometimes it is painful to integrate with browsers that are installed as Snap packages; this guide gives you a few examples of a correct configuration.

What are Snap packages?

Snap packages are the new and preferred way to distribute applications…

Set your User Agent with Python Requests, Scrapy, and Selenium

TL;DR: When you crawl the web to collect data, you should set a User Agent that identifies you, or one that hides the tool you are using. Here you can find how to set the User Agent in Python Requests, Scrapy, and Selenium.

What is the User Agent?

A User Agent is a string…
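For Python Requests, a minimal sketch could set the header once on a session so that every request reuses it. The User Agent string and URLs below are hypothetical placeholders, and the request is only prepared, not sent, so the example needs no network access:

```python
import requests

# Hypothetical User Agent that identifies the crawler and a contact page.
USER_AGENT = "my-crawler/1.0 (+https://example.com/contact)"

# Set the header on a session so all requests made through it share it.
session = requests.Session()
session.headers.update({"User-Agent": USER_AGENT})

# Prepare a request without sending it, to inspect the final headers.
request = requests.Request("GET", "https://example.com/")
prepared = session.prepare_request(request)
sent_user_agent = prepared.headers["User-Agent"]
```

In a real crawl you would call `session.get(url)` instead of preparing the request by hand. Scrapy exposes the same idea through its `USER_AGENT` setting.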

MongoDB aggregation: save results in a Pandas DataFrame

TL;DR: MongoDB is one of the leading NoSQL databases, and its aggregation framework enables powerful queries, as well as data operations. We will see how to save results from aggregation pipelines into a Pandas DataFrame.

From MongoDB to Pandas

I already provided an introduction to MongoDB and Compass in a previous post for my…
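A minimal sketch of the conversion, assuming a PyMongo-style collection whose `aggregate()` method returns an iterable of dictionaries: the helper name `aggregate_to_dataframe`, the pipeline, and the database and field names are all hypothetical and not taken from the post.

```python
import pandas as pd


def aggregate_to_dataframe(collection, pipeline):
    """Run an aggregation pipeline and load its documents into a DataFrame."""
    # Materialize the cursor; each document becomes one row.
    documents = list(collection.aggregate(pipeline))
    return pd.DataFrame(documents)


# Hypothetical pipeline: total amount per category, largest first.
pipeline = [
    {"$group": {"_id": "$category", "total": {"$sum": "$amount"}}},
    {"$sort": {"total": -1}},
]

# Usage with PyMongo (connection string and names are assumptions):
# from pymongo import MongoClient
# collection = MongoClient("mongodb://localhost:27017")["shop"]["orders"]
# df = aggregate_to_dataframe(collection, pipeline)
```

Materializing the cursor with `list()` loads every result into memory, which is fine for aggregated summaries but may need batching for very large result sets.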

MongoDB aggregation: match a field with values in a list

TL;DR: MongoDB is one of the leading NoSQL databases, and its aggregation framework enables powerful queries, as well as data operations. We will see how to match a field with values in a list to help you select the documents you need.

Matching values in a list

I already provided an introduction to MongoDB and Compass in a…
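Matching a field against a list of values is typically done with the `$in` operator inside a `$match` stage. The sketch below only builds the pipeline as plain Python data; the helper name `match_in_stage` and the field and values are hypothetical.

```python
def match_in_stage(field, values):
    """Build a $match stage that keeps documents whose field is in values."""
    # $in expects an array, so convert any iterable to a list.
    return {"$match": {field: {"$in": list(values)}}}


# Hypothetical example: keep documents whose "status" is open or pending.
pipeline = [match_in_stage("status", ["open", "pending"])]
```

The resulting list can be passed to a PyMongo collection's `aggregate()` method, or pasted into Compass after converting it to JSON.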