JSON Data
What we will cover…
- What is JSON
- Loading JSON in Python
- Opening files in Python
- JSON lines
What is JSON
JSON stands for JavaScript Object Notation.
JavaScript is a language, similar in many ways to Python, that was created to run in web browsers.
The equivalent of a dictionary in Python is called an “object” in JavaScript.
“Notation” simply means the syntax for writing an object in JavaScript.
What is JSON
Thus, you can think of JSON as a way to represent dictionaries (or lists) as a string.
Why do we need this?
Sometimes we want to share data between programs and computers, even across different programming languages. So we need some way to turn a complex data structure (a dictionary) into a string for sending around.
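As a minimal sketch of "turning a dictionary into a string", Python's built-in json module offers json.dumps. The dictionary named post below is a made-up example, not data from a real program.

```python
import json

# A hypothetical dictionary we want to send to another program
post = {"id": 58726, "title": "Hello World", "author": None, "isCool": True}

# Serialize: dictionary -> JSON string
as_string = json.dumps(post)

print(type(as_string))  # <class 'str'>
print(as_string)
```

The resulting string can be written to a file or sent over a network, and any language with a JSON parser can read it back.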
JSON Syntax
{
  "id": 58726,
  "title": "Hello World",
  "author": null,
  "isCool": true
}
JSON syntax looks an awful lot like the syntax for Python dictionaries, so it should be easy to read.
Some noteworthy differences:
- Only double quotes ("").
- Booleans are lowercase (true).
- Missing values are null.
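The differences listed above can be seen with a small sketch using the built-in json module. The example strings are invented for illustration.

```python
import json

# JSON true/null become Python True/None when parsed
parsed = json.loads('{"author": null, "isCool": true}')
print(parsed)  # {'author': None, 'isCool': True}

# Single quotes are not valid JSON, so parsing fails
single_quotes_ok = True
try:
    json.loads("{'author': null}")
except json.JSONDecodeError as err:
    single_quotes_ok = False
    print("Invalid JSON:", err)
```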
Deserializing
my_json = '{ "id": 58726, "author": null }'
import json
dat = json.loads(my_json) # a dictionary!
dat['author'] # None
JSON, as we mentioned, is a string.
To access the data in Python, we need to parse this string (also known as deserializing) into a Python dictionary.
We can do that with the built-in json module.
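Since JSON can represent lists as well as dictionaries, json.loads may return either one. A short sketch, with made-up input strings:

```python
import json

as_dict = json.loads('{"id": 58726, "tags": ["intro", "json"]}')
as_list = json.loads('[1, 2.5, "three", null]')

print(type(as_dict))       # <class 'dict'>
print(as_dict["tags"][0])  # intro
print(as_list)             # [1, 2.5, 'three', None]
```

Nested structures come back as nested Python dictionaries and lists, so ordinary indexing works on the result.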
Files
.
├── foo.py
└── hello-world.json
JSON, as a string, can easily be written to files. This can be a convenient way to store data in a file!
JSON files usually have the .json ending.
Let’s try to read the JSON from this file!
Files
with open('hello-world.json') as f:
    pass  # do something with f
To open files in Python, we use the built-in open function.
However, it’s extremely important to close any file we open. So that we don’t forget to do that, we use a context manager: with.
with provides us with a block of code (indented, as always!) and a new variable, f, which represents our file. When the block is finished, f is automatically closed.
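One way to check the automatic closing is the closed attribute of the file object. The sketch below first creates a small throwaway file in a temporary directory (an assumption made only so the example runs anywhere), then opens it for reading.

```python
import os
import tempfile

# Setup: create a throwaway file so the example is self-contained
path = os.path.join(tempfile.mkdtemp(), "hello-world.json")
with open(path, "w") as f:
    f.write('{"id": 58726}')

with open(path) as f:
    inside = f.closed   # False: the file is open inside the block

outside = f.closed      # True: the file was closed when the block ended
print(inside, outside)  # False True
```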
Files
with open('hello-world.json') as f:
    contents = f.read()
    print(contents)
    # prints contents of file
The file object returned by open is very easy to use: f.read() returns the contents of the file as a string.
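Because f.read() gives back a string, it can be handed straight to json.loads. The sketch below assumes the file holds a single JSON object, and it writes that file to a temporary directory first so the example runs on its own.

```python
import json
import os
import tempfile

# Setup: write a sample file (contents invented for illustration)
path = os.path.join(tempfile.mkdtemp(), "hello-world.json")
with open(path, "w") as f:
    f.write('{"id": 58726, "title": "Hello World", "author": null, "isCool": true}')

with open(path) as f:
    contents = f.read()      # the whole file as one string

dat = json.loads(contents)   # parse the string into a dictionary
print(dat["title"])          # Hello World

# json.load (no "s") reads and parses a file object in one step
with open(path) as f:
    same = json.load(f)
```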
Files
with open('hello-world.json') as f:
    for line in f:
        print(line)
        # prints each line in the file
The file object f is also an iterable: each element represents one line of the file as a string.
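Each line obtained this way still carries its trailing newline character, which is why printing lines directly shows blank lines in between. A small sketch, using a throwaway two-line file created just for the example:

```python
import os
import tempfile

# Setup: a small two-line file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "lines.txt")
with open(path, "w") as f:
    f.write("first\nsecond\n")

with open(path) as f:
    raw = [line for line in f]

# Remove the trailing newline from each line
stripped = [line.rstrip("\n") for line in raw]

print(raw)       # ['first\n', 'second\n']
print(stripped)  # ['first', 'second']
```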
JSON Files
Sometimes, in .json files, the entire file is a single JSON value that can be read and parsed in one go.
Other times, the file consists of multiple lines, and each line is its own JSON value, which should be read and parsed line by line. This is a JSON lines file.
In data analysis, we more often see the second type of file, so we often deal with the lines.
JSON Lines files
with open('hello-world.json') as f:
    lines = [line for line in f]
# Now we can use lines
import json
for line in lines:
    dat = json.loads(line)
    print(dat)
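Real JSON lines files sometimes contain blank lines, which json.loads cannot parse. The sketch below is one possible way to handle that, under the assumption that blank lines should simply be skipped; the file name posts.json and its contents are made up.

```python
import json
import os
import tempfile

# Setup: a JSON lines file with a blank line in the middle
path = os.path.join(tempfile.mkdtemp(), "posts.json")
with open(path, "w") as f:
    f.write('{"id": 1, "title": "Hello"}\n')
    f.write('\n')
    f.write('{"id": 2, "title": "World"}\n')

records = []
with open(path) as f:
    for line in f:
        line = line.strip()   # drop the newline and surrounding spaces
        if not line:
            continue          # skip blank lines
        records.append(json.loads(line))

print(records)
```

The result is a list of dictionaries, one per non-empty line, which is a convenient shape for further analysis.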
Review
- What is JSON
- Loading JSON in Python
- Opening files in Python
- JSON lines