Paul Larsen

Freelance Data Scientist, PhD Mathematics, Rhodes Scholar. I help you succeed with nearly all things data and AI.


fake-data-for-learning on PyPI

Published Apr 17, 2020

My package-in-the-making fake-data-for-learning, which creates interesting fake data for machine and human learning, is ‘in-the-making’ no more! You can now install it from PyPI with pip install fake-data-for-learning.

The Easy Part

I developed the package with setup.py from the beginning, via the -e . line in the requirements.txt. The actual packaging worked exactly as documented in the official tutorial.

The Tweaks

The modifications to my previous code for PyPI fell into two categories: first, making the PyPI landing page look decent, and second, topics I uncovered or finally got around to while preparing for packaging.

Cosmetic

To make the PyPI landing page for my project look decent, I started with the setup.py boilerplate from cookiecutter-pypackage, which reads the README and assigns its contents to long_description.
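As a rough illustration of that README-to-long_description pattern, here is a minimal sketch. It is not the cookiecutter-pypackage boilerplate itself: the helper names read_long_description and build_setup_kwargs are hypothetical, and a Markdown README named README.md is an assumption.

```python
from pathlib import Path


def read_long_description(readme_path="README.md"):
    """Return the README text, or an empty string if the file is missing."""
    path = Path(readme_path)
    return path.read_text(encoding="utf-8") if path.exists() else ""


def build_setup_kwargs(readme_path="README.md"):
    """Collect the keyword arguments that would be passed to setuptools.setup."""
    return {
        "name": "fake-data-for-learning",
        # PyPI renders this text as the project's landing page
        "long_description": read_long_description(readme_path),
        # Assumption: the README is Markdown; use "text/x-rst" for reStructuredText
        "long_description_content_type": "text/markdown",
    }


# In a real setup.py the dictionary would be unpacked into setup():
#     from setuptools import setup
#     setup(**build_setup_kwargs())
```

Keeping the keyword arguments in a plain dictionary makes it easy to inspect what PyPI will receive before building a distribution.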

When rendered on PyPI, however, all relative links were broken. I tried making them absolute to my GitHub repo, but then decided to violate DRY and just copy-paste a short description into setup.py.
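For comparison, the abandoned approach of making the links absolute could be automated along these lines. This is only a sketch under stated assumptions: the base URL is a placeholder rather than the real repository address, the function name absolutize_links is hypothetical, and the regular expression only handles simple inline Markdown links of the form [text](target) without titles.

```python
import re
from urllib.parse import urljoin

# Placeholder base URL (an assumption); substitute the real repository location.
BASE_URL = "https://github.com/example-user/example-repo/blob/master/"

# Matches simple inline Markdown links and images: [text](target)
LINK_PATTERN = re.compile(r"(\[[^\]]*\]\()([^)\s]+)(\))")


def absolutize_links(markdown, base_url=BASE_URL):
    """Rewrite relative link targets so that they point at base_url."""

    def fix(match):
        prefix, target, suffix = match.groups()
        # In-page anchors should keep pointing at the rendered page itself
        if target.startswith("#"):
            return match.group(0)
        # urljoin leaves absolute URLs alone and resolves relative paths
        return prefix + urljoin(base_url, target) + suffix

    return LINK_PATTERN.sub(fix, markdown)
```

The result could then be passed as long_description instead of the raw README text. Images would likely need a raw-content base URL rather than a blob one, which is part of why a short hand-written description can be the simpler choice.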

The new and the procrastinated

While reading packaging documentation, I noticed that pytest-runner has been deprecated due to security vulnerabilities. The recommendation is to use something else, like tox.

I first heard of tox a few years back when starting with cookiecutter-pypackage, but had put off doing more than reading the welcome page.

When using tox, I quickly discovered I was guilty of all sorts of PEP sins. I had been using pylint in Visual Studio Code, but somehow didn’t catch as much as tox did (or maybe I was just better at ignoring it). I also installed the cornflakes linter, which made my style faux pas more obvious.

Wrapup

Generating interesting fake data is now even easier with pip install fake-data-for-learning.

If you like fake-data-for-learning, please star it on GitHub. If you run into any problems, just open an issue there.