Using the Tilde as a Boolean NOT in Pandas

There are lots of times when you want the inverse of an operation, like finding all of the rows that do not meet some criterion. This is possible, but hard to search for, because the functionality hides behind the tilde (~) operator.

Here’s how it works:

import pandas

What if you have a column that is supposed to be a date-as-a-number, but your parse-this-column-as-a-date code keeps barfing? Now you can filter by a doesn’t-match-this-format criterion.

myDataFrame = pandas.DataFrame(["20010901", "18670701", "10660106", "20010901", "Thursday"], columns=["A"], index=[1, 2, 3, 4, 5])

Return a DataFrame of the rows where the format doesn’t match:

weirdos = myDataFrame[~myDataFrame['A'].str.match(r'\d{8}$')]

weirdos is now a DataFrame containing only the rows that don’t match the format, because the tilde (~) in front of the condition inverts it.
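If it helps to see the moving parts, here is a rough sketch of the same idea split into steps. The variable names are mine, and the pattern assumes "exactly eight digits" is what the column should contain.

```python
import pandas

# Same sample data as above; the variable names here are only illustrative.
frame = pandas.DataFrame(
    ["20010901", "18670701", "10660106", "20010901", "Thursday"],
    columns=["A"],
    index=[1, 2, 3, 4, 5],
)

# A boolean Series: True where the value looks like eight digits.
looks_like_date = frame["A"].str.match(r"\d{8}$")

# The tilde flips every True to False and every False to True.
not_a_date = ~looks_like_date

matching_rows = frame[looks_like_date]  # rows that fit the format
odd_rows = frame[not_a_date]            # rows that don't
```

Naming the mask first makes it easy to print it and check what the tilde is actually inverting.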

I have also used this to get subsets of non-conforming DataFrames. I know what the data is supposed to look like, but it is too hard to know all of the ways that it may not look like that.
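One way data can fail to conform is by being missing altogether. As far as I can tell, the safest approach is to tell the string method how to treat missing values before inverting. Here is a small sketch with made-up data that uses the na=False argument of str.match, so that a missing value counts as "doesn’t match":

```python
import pandas

# Made-up data with a missing value in it.
messy = pandas.DataFrame({"A": ["20010901", None, "Thursday"]})

# na=False treats missing values as "no match", so the mask stays
# purely boolean and the tilde can invert it cleanly.
conforms = messy["A"].str.match(r"\d{8}$", na=False)

non_conforming = messy[~conforms]
```

The missing value ends up in non_conforming along with "Thursday", which is usually what you want when hunting for bad rows.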

I found this when I was looking for the NOT equivalent of .isin – which unfortunately doesn’t exist. That’s the problem with Huffman coding your operators into single characters – you can’t easily search for them if you don’t know what they’re called.
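For the record, this is roughly what the "not .isin" case looks like. The data and column name are invented for the example.

```python
import pandas

# Hypothetical data, just for illustration.
cities = pandas.DataFrame({"city": ["Ottawa", "Toronto", "Halifax", "Regina"]})

# Rows whose city IS in the list...
in_list = cities[cities["city"].isin(["Toronto", "Regina"])]

# ...and, with a tilde in front, rows whose city is NOT in the list.
not_in_list = cities[~cities["city"].isin(["Toronto", "Regina"])]
```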


One of the delights of coming back to Python in an intensive way after many years is some of the new ways to do the usual things that I hadn’t learned before.

In this case, I am finding the use of “f-strings” (introduced in Python 3.6, described in PEP 498) to be quite delightful.

print(f"Warning: {sys.argv[2]} exists!")

The previous syntax for format strings always slipped away from my memory, but these seem a lot stickier, and they give me a lot of freedom to just keep writing.
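For comparison, here is a quick sketch of the same message written three ways, plus an f-string with an expression and a format spec inside the braces. The values are made up.

```python
# Hypothetical values for illustration.
path = "report.csv"
count = 3
ratio = 0.4567

# The two older styles...
old_percent = "Warning: %s exists!" % path
old_format = "Warning: {} exists!".format(path)

# ...and the f-string, where the variable sits right where it is used.
new_fstring = f"Warning: {path} exists!"

# Expressions and format specs can go straight inside the braces.
summary = f"{count * 2} rows, {ratio:.1%} done"

print(new_fstring)
print(summary)
```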

Python is not the greatest language in the universe, but it can allow a lot of fluency that I never experienced in other languages.

Keeping Secrets in Code

The problem of keeping secrets (usernames, passwords, API keys, etc.) out of the code that you write is a pretty old one. For a long time I didn’t have a solution that I liked, especially when putting code up on GitHub.

Until now. I am putting things like that in a “secrets” file, or in environment variables, which are easy to access from your code, but don’t show up in your code repository. Here’s an example in Python of keeping a “secrets” file that the script can access, and then yoinking its contents into a dictionary for easy reference:

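# SECRETFILE is the path to the secrets file, defined elsewhere in the script.
def getSecrets():
  # Each line looks like "name=====value". Split on the first "=====" only,
  # and skip blank lines so a stray newline doesn't break dict().
  with open(SECRETFILE, "r") as scrts:
    return dict(line.strip().split("=====", 1) for line in scrts if line.strip())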

secrets = getSecrets()

api_url = secrets["API URL"]
api_key = secrets["API Key"]
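The environment-variable route mentioned earlier could look something like this. It is only a sketch: the variable names TEST_API_URL and TEST_API_KEY and the helper get_env_secret are my own inventions, and you would set the variables in your shell (or wherever your scripts get launched) before running the code.

```python
import os

def get_env_secret(name, default=None):
    """Return the environment variable `name`, or `default` if it is unset."""
    return os.environ.get(name, default)

# Hypothetical variable names; use whatever fits your project.
api_url = get_env_secret("TEST_API_URL", "")
api_key = get_env_secret("TEST_API_KEY")

if api_key is None:
    print("TEST_API_KEY is not set, so API calls will fail.")
```

Using os.environ.get rather than os.environ[...] means a missing variable gives you None (or your default) instead of a KeyError, so the script can complain politely.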

""" The file itself would look like this:

API Key=====zEV}pF_vn4g35Ye:
API URL=====