How to Upload With the Python YouTube Data API
Part 1: Using YouTube's Python API for Data Science
A simple technique for searching YouTube's vast catalog
Last week, I wrote a quick guide to using Google's Speech Recognition API, which I described as "kinda-sorta-really confusing." Well, guess what? Their YouTube Data API isn't super clear either.
If you're an expert coder with decades of experience, you can probably stop reading right now. However, if you're an intermediate or self-taught developer (like myself), this guide should help provide a jump-start to using YouTube for data science.
In the interest of time and overall readability, I've decided to break this tutorial into multiple parts. In part one, we will focus on getting the API library installed and authenticated, in addition to making simple keyword queries.
The parts to come will focus on other types of tasks, like comment collection, geographic queries and working with channels.
To complete this tutorial, you'll need the following tools:
- Python 2.7
- A Google account
Getting started
1) Clone the GitHub repository
git clone https://github.com/spnichol/youtube_tutorial.git
cd youtube_tutorial
2) Installing YouTube Official Python Client Library
pip install --upgrade google-api-python-client
3) Activating YouTube API
Note: In order to complete this step, you need to have a "project" created. If you're unsure how to do this, check out my Speech Recognition tutorial for a quick guide.
Head over to the Google Cloud Console, click the hamburger menu in the top left-hand corner and select "API Manager".
Select "YouTube Data API" from the YouTube APIs.
Click "Enable"
Click "Credentials" from the left-hand navigation panel and select "Create credentials." You'll want to choose API key from the drop-down list. You should see a message that says "API key created" with the alphanumeric API key. Copy and save this in a safe place!
Setting up our Python scripts
Open youtube_videos.py in your code editor. This is a heavily modified version of YouTube's sample code.
Replace the DEVELOPER_KEY variable on the 5th line with the API key we created earlier and save the file.
DEVELOPER_KEY = "REPLACETHISWITHYOURKEY_GOTIT?"
Cool, cool, cool. We're ready to start writing the code to do our keyword queries. Go ahead and create a new Python script and save it in the same directory.
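Back in youtube_videos.py for a second: if you'd rather not paste the key straight into the source file, one common alternative is to read it from an environment variable. This is only a minimal sketch, not what the tutorial repository does. The variable name YOUTUBE_API_KEY and the helper load_developer_key are placeholders I made up for illustration.

```python
import os


def load_developer_key(env_var="YOUTUBE_API_KEY"):
    """Return the API key stored in an environment variable.

    The variable name is a hypothetical choice for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            "Set the {0} environment variable to your API key".format(env_var)
        )
    return key
```

With something like this in place, the assignment in youtube_videos.py could become DEVELOPER_KEY = load_developer_key(), provided the variable is set in your shell before you run the script.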
Setting up our Python scripts
We want to be able to use the functions in the youtube_videos.py file where we saved our API key. One way to do this is by appending that directory to our Python path.
1) Append directory to Python path
import sys
sys.path.append("/home/your_name/youtube_tutorial/")
2) Import youtube_search function
Now that we have our path set, we can bring in the youtube_search function from the youtube_videos.py file with a simple import statement. Note that the module name goes in the import without the .py extension.
from youtube_videos import youtube_search
Let's also import the json library, since it will come in handy when parsing the JSON output of the API.
import json
Great. We're ready to try a quick keyword search with our youtube_search function.
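The contents of youtube_videos.py aren't reproduced in this post, so here is a rough sketch of what a function like youtube_search could look like, based on the public search.list endpoint of the YouTube Data API v3. Treat it as an illustration under assumptions rather than the actual file: search_page and build_client are names I made up, and I've assumed a (token, items) return value with "last_page" standing in when the response has no next page token, since that matches how youtube_search is used in the rest of this guide.

```python
def build_client(developer_key):
    """Create a YouTube Data API v3 client (needs google-api-python-client)."""
    # Imported here so the rest of the sketch can be read and tested
    # without the library installed.
    from googleapiclient.discovery import build
    return build("youtube", "v3", developerKey=developer_key)


def search_page(client, query, max_results=50, token=None):
    """Run one keyword search and return (next_token, items).

    Hypothetical stand-in for the tutorial's youtube_search function.
    """
    response = client.search().list(
        q=query,
        type="video",
        part="id,snippet",
        maxResults=max_results,  # the API allows at most 50 per request
        pageToken=token,
    ).execute()

    # nextPageToken is absent from the response on the final page.
    next_token = response.get("nextPageToken", "last_page")
    return next_token, response.get("items", [])
```

Under those assumptions, a call would look something like token, videos = search_page(build_client(DEVELOPER_KEY), "spinners").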
4) Test out youtube_search function.
Let's assume we love fidget spinners (we don't) and we want to see what videos YouTube has about them.
test = youtube_search("spinners")
test
Your output should look like this:
The output of our youtube_search is a tuple of len = 2. The first item is some weird six-character string. The second item is a bunch of JSON. Let's ignore the weird six-character string for a bit and select just the JSON.
just_json = test[1]
len(just_json)
Now you should have this:
Now we can see we have a JSON object of len = 50. Each item is a YouTube video with details about that video, such as the ID, title, date published, thumbnail URL, duration, etc.
Let's say we want to get the title of each video. We can easily loop through the results and pull it out of the parsed JSON.
for video in just_json:
    print(video['snippet']['title'])
This gives us:
Fidget spinner pizza? Huh?
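The same pattern works for the other fields. As a sketch, here is a small helper that flattens one search result into a plain dictionary. The sample item is hand-written for illustration (the ID, title and URL are made up), though the key paths follow the search result layout in the YouTube Data API v3. The name summarize_video is my own.

```python
def summarize_video(item):
    """Pick a few commonly used fields out of one search result item."""
    snippet = item.get("snippet", {})
    thumbnails = snippet.get("thumbnails", {})
    return {
        "video_id": item.get("id", {}).get("videoId"),
        "title": snippet.get("title"),
        "published_at": snippet.get("publishedAt"),
        "thumbnail_url": thumbnails.get("default", {}).get("url"),
    }


# Hand-written sample item, not real API output.
sample_item = {
    "id": {"kind": "youtube#video", "videoId": "abc123XYZ_0"},
    "snippet": {
        "title": "Example spinner video",
        "publishedAt": "2017-05-01T12:00:00.000Z",
        "thumbnails": {"default": {"url": "https://example.com/thumb.jpg"}},
    },
}

print(summarize_video(sample_item)["title"])
```

Using .get() with defaults means a result that lacks one of these keys yields None for that field instead of raising a KeyError halfway through a loop.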
Ok, onward. Now, we figured out earlier that our output consisted of 50 distinct videos. That's great and all (certainly way more videos about spinners than I'm interested in watching), but that couldn't be all of them, right?
Right. If we want more than 50 videos, which is the maximum we can get in one request, we have to ask for another round of 50. We can do this by using that weird six-character string we saw earlier … also known as a token.
If we send the token with the next request, YouTube will know where it left off, and send us the next 50 most-relevant results. Let's try.
token = test[0]
youtube_search("spinners", token=token)
Now we have:
Awesome. Fifty more videos about spinners. Also, if you noticed, we got another token. If we want to keep going, we just rinse and repeat with that token.
Obviously doing it this way would really suck. Instead, we can write a function to automate the process a bit.
Automate the process
1) Instantiate a dictionary to store results
Before we write our function, let's instantiate a dictionary outside of it with the variable names we want to save.
video_dict = {'youID':[], 'title':[], 'pub_date':[]}
2) Define our function
We can name our function grab_videos and add two parameters. The first is the keyword, while the second is an optional parameter for the token.
def grab_videos(keyword, token=None):
3) Add youtube_search and save our variables
Just like before, we'll do a simple search with the youtube_search function and save our token and JSON results separately.
    # Pass the token along, otherwise every call fetches the first page again
    res = youtube_search(keyword, token=token)
    token = res[0]
    videos = res[1]
4) Loop through results, append to dictionary & return token
    for vid in videos:
        video_dict['youID'].append(vid['id']['videoId'])
        video_dict['title'].append(vid['snippet']['title'])
        video_dict['pub_date'].append(vid['snippet']['publishedAt'])
    print("added " + str(len(videos)) + " videos to a total of " + str(len(video_dict['youID'])))
    return token
I also added a little print statement here to update us on the number of videos we have collected each time the function is called.
Finally, we return the token so we can use it the next time we call the function.
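Put together, the pieces from steps 1 through 4 look roughly like the sketch below. So that it runs without a network connection or an API key, I've swapped the real youtube_search for a fake one that serves two canned pages. fake_youtube_search and its data are invented for this example, and the search function is passed in as a parameter (search_fn), which the tutorial's version doesn't do.

```python
video_dict = {'youID': [], 'title': [], 'pub_date': []}

# Two canned pages keyed by the token that requests them (made-up data).
FAKE_PAGES = {
    None: ("TOKEN2", [
        {'id': {'videoId': 'vid1'},
         'snippet': {'title': 'Spinner one',
                     'publishedAt': '2017-05-01T00:00:00Z'}},
    ]),
    "TOKEN2": ("last_page", [
        {'id': {'videoId': 'vid2'},
         'snippet': {'title': 'Spinner two',
                     'publishedAt': '2017-05-02T00:00:00Z'}},
    ]),
}


def fake_youtube_search(keyword, token=None):
    """Offline stand-in that mimics the (token, videos) return shape."""
    return FAKE_PAGES[token]


def grab_videos(keyword, token=None, search_fn=fake_youtube_search):
    """Fetch one page of results, store them, and return the next token."""
    res = search_fn(keyword, token=token)
    token = res[0]
    videos = res[1]
    for vid in videos:
        video_dict['youID'].append(vid['id']['videoId'])
        video_dict['title'].append(vid['snippet']['title'])
        video_dict['pub_date'].append(vid['snippet']['publishedAt'])
    print("added " + str(len(videos)) + " videos to a total of "
          + str(len(video_dict['youID'])))
    return token
```

With the canned data, the first call returns "TOKEN2", and passing that token back in returns "last_page", which is the value the while loop in the next step watches for.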
5) Call function with while statement
Now that we've written the function, we can write a few short lines of code to put it to work.
First, we'll call the function with our keyword, saving the results to the variable token.
We can then use a while statement to check if the function returned a valid token (i.e. there are more results) or if we've hoovered up everything. If it sees "last_page", it will stop the execution of the code.
token = grab_videos("spinners")
while token != "last_page":
    token = grab_videos("spinners", token=token)
Output should look like this:
Conclusion
Alright, now you're at least in part ready to start doing your own data collection on YouTube.
If this article was helpful, please let me know via e-mail or in the comments section. If not, I'll take up another hobby! Stay tuned for parts two and three.
Source: https://towardsdatascience.com/tutorial-using-youtubes-annoying-data-api-in-python-part-1-9618beb0e7ea