Luminoso Blog » python

Fixing common Unicode mistakes with Python â€” after they’ve been made

August 20, 2012, 9:43 am

Update: not only can you fix Unicode mistakes with Python, you can fix Unicode mistakes with our open source Python package ftfy. It’s on PyPI and everything. You have almost certainly seen text on a...
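A minimal sketch of the kind of mistake the post is about, using only the standard library. It assumes one specific cause, UTF-8 bytes that were wrongly decoded as Latin-1; the helper names `garble` and `repair` are hypothetical, and this is not ftfy's implementation, which covers many more cases.

```python
def garble(text: str) -> str:
    """Simulate the mistake: encode as UTF-8, then wrongly decode as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def repair(text: str) -> str:
    """Reverse the mistake; return the text unchanged if it does not fit the pattern."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not Latin-1-encodable, or the bytes are not valid UTF-8:
        # assume the text was never damaged this way and leave it alone.
        return text


damaged = garble("café")
print(damaged)          # cafÃ©
print(repair(damaged))  # café
print(repair("café"))   # café (already correct, left untouched)
```

The `try`/`except` is what keeps the repair from damaging text that was fine to begin with: correctly decoded text usually does not survive the round trip, so it is returned as-is.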

Fixing Unicode mistakes and more: the ftfy package

August 24, 2012, 12:44 pm

There’s been a great response to my earlier post, Fixing common Unicode mistakes with Python. This is clearly something that people besides me needed. In fact, someone already made the code into a web...

How to make an orderly transition to Python Requests 1.0 instead of running...

January 4, 2013, 8:29 am

There’s a lovely Python module for making HTTP requests, called requests. We use it at Luminoso. A bunch of code we depend on uses it. Our API customers use it. Basically everyone uses it because it’s...

ftfy (fixes text for you) 4.0: changing less and fixing more

May 21, 2015, 1:52 pm

ftfy is a Python tool that takes in bad Unicode and outputs good Unicode. I developed it because we really needed it at Luminoso — the text we work with can be damaged in several ways by the time it...

wordfreq: Open source and open data about word frequencies

September 1, 2015, 8:20 am

Often, in NLP, you need to answer the simple question: “is this a common word?” It turns out that this leaves the computer to answer a more vexing question: “What’s a word?” Let’s talk briefly about...
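To see why “What’s a word?” is the harder question, here is a small sketch that counts words in one sentence under two different tokenization rules. The function name `count_words`, the sample sentence, and both regular expressions are illustrative assumptions; this is not how wordfreq tokenizes text.

```python
import re
from collections import Counter


def count_words(text: str, pattern: str) -> Counter:
    """Count lowercased tokens, where a token is whatever `pattern` matches."""
    return Counter(token.lower() for token in re.findall(pattern, text))


sample = "Don't panic: it's a well-known word, and don't forget it."

# Rule 1: a word is any run of letters, digits, or underscores.
naive = count_words(sample, r"\w+")

# Rule 2: apostrophes and hyphens between word characters stay inside the word.
lenient = count_words(sample, r"\w+(?:['-]\w+)*")

print(naive["don't"], naive["t"])               # 0 2
print(lenient["don't"], lenient["well-known"])  # 2 1
```

The same sentence yields different "words", and therefore different frequencies, depending on the rule chosen, which is why a frequency list has to settle on a definition of a word first.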

wordfreq 1.2 is better at Chinese, English, Greek, Polish, Swedish, and Turkish

October 29, 2015, 12:29 pm

In a previous post, we introduced wordfreq, our open-source Python library that lets you ask “how common is this word?”...
