| -rw-r--r-- | blog/youtube_to_rss.md | 128 |
1 files changed, 99 insertions, 29 deletions
diff --git a/blog/youtube_to_rss.md b/blog/youtube_to_rss.md
index c96e31f..89bd3d7 100644
--- a/blog/youtube_to_rss.md
+++ b/blog/youtube_to_rss.md
@@ -1,54 +1,124 @@
-# Convert your Youtube subsciptions to RSS feeds
+# YouTube subscriptions to RSS feeds

-> Fun fact: I'm such a zoomer that I only discovered RSS recently (which is [years] old).
+I spend *waaaaaaaaaaaay* too much time on YouTube,
+this is mainly due to the recommendations (which are freakishly accurate).
+I generally go on my subscriptions page then get baited into watching garbage content from the recommendations.
+Having RSS feeds for all my subscriptions will allow me to have more control over the content I consume.
+The videos will be streamed directly to my computer to avoid going on the website.

-## Get all subscription
+> Fun fact: I'm such a zoomer that I only discovered RSS recently, which is older than me (1995).

-You probably could do some fancy stuff with the google API but it would probably be overkill.
-Go to <https://youtube.com/feed/channels>, scroll to the bottom (Good luck if you got 1000+, your middle finger wight get cramped).
-Right click on something that is not a link or an image and select `Save as`, give a name to the file and save it.
+## Get all subscriptions

-## Parse list of subscription
+You could do some fancy stuff with the [YouTube API](https://developers.google.com/youtube/v3/docs/subscriptions/list) but it would be overkill.
+Instead go to <https://www.youtube.com/feed/channels>, scroll to the bottom
+(Good luck if you got 1000+ subscriptions, your scrolling finger wight get cramped).
+Right click on something that is not a link or an image and select `Save as`,
+give a name to the file (`channels.html` in my case) and save it.

-Get all the channels urls, replace `channels.html` with the HTML file you saved.
+### Parsing

-```
-grep -o -E 'href="https://www.youtube.com/(c|channel|user)/[a-zA-Z0-9 ]+"' channels.html |
+The following command will retrieve all links to YouTube channels in the file we just saved.
+Replace `channels.html` with the name of the HTML file you saved.
+
+```sh
+grep -o -E 'href="https://www.youtube.com/(c|channel|user)/[a-zA-Z0-9 ]+"' \
+    channels.html |
     sort | uniq |
     sed 's/href="\(.*\)"/\1/' > channel_urls
 ```
-Some channels don't aren't prefixed with `/c/` `/channel/` or `/user/` in the url
-so you'll either have to add them manually or change the `grep` regex to accept all url
-which begin with `https://www.youtube.com/` and remove the links which aren't youtube channels.
-
-> Those channels are pretty rare tho, on my 300+ subscriptions I only had 2 or 3.
-
-## Choose the channels to add to your feeds
+Some channels aren't prefixed with `/c/`, `/channel/` or `/user/` in the URL
+so you'll either have to add them manually or change the `grep` regex to accept all URL
+which begin with `https://www.youtube.com/` and remove the non YouTube channel links.

-Now comes the tedious and cringing part where you need to through aaaall your old and obscure subscription and filter the bad ones out.
-> Protip: If you want to automate this part you can ask Google to do it for you since they know you better than yourself by now.
+> Those channels are pretty rare tho, on my 300+ subscriptions I only had 2 or 3 which fell in this category.
+## Get each channel feed/name

-## Get channel info
+Maybe RSS readers understand the HTML tag `<link rel="alternate" type="application/rss" .../>`
+but [newsboat](https://newsboat.org/) (which is the one I use) unfortunately doesn't.
+This **pipeline of hell** will get all channel name and feed,
+format them in lines that you can directly add to your `urls` file.

-I guess most RSS reader understand the HTML /balise/ `<link rel="alternate" type="application/rss" .../>` but [newsboat]() (which I use) doesn't unfortunatly.
-We can get the url to a channel feed with a simple `curl` into `grep`.
+```sh
+# stdbuf -oL force line buffering instead of output being buffered in pipe
+# this allows to see the process in real time
+# xargs - pass each channel url to curl
+# grep - get channel name and feed url
+# awk - uniq without sorted lines
+# sed - join every 2 line
+# sed - put feed url then the channel name in a comment after it
+# sed - trim surrounding whitespaces
+# tee - output to terminal and a urls file
+# cat - print line number

-```
 xargs -a channel_urls curl -s |
     stdbuf -oL grep -o -E \
         -e '<title>.* - YouTube</title>' \
         -e 'https://www\.youtube\.com/feeds/videos\.xml\?channel_id=[a-zA-Z0-9_-]+' |
-    awk '!seen[$0]++' |
-    sed 's:<title>\(.*\) - YouTube</title>:\1:' |
-    sed 'N; s/\n/ /' |
-    sed 's/\(.*\) # \(https:.*\)/\2 # \1/' |
-    tee /dev/stderr 2> urls |
+    stdbuf -oL awk '!seen[$0]++' |
+    stdbuf -oL sed 'N; s/\n/ /' |
+    stdbuf -oL sed 's_\(.*\)<title>\(.*\) - YouTube<\/title>\(.*\)_\1\3 # \2_' |
+    stdbuf -oL sed 's/^ *\(.*\) *$/\1/' |
+    stdbuf -oL tee /dev/stderr 2> urls |
     cat -n
 ```
+The `urls` file should look something like this:
+
+```
+...
+https://www.youtube.com/feeds/videos.xml?channel_id=UCS0N5baNlQWJCUrhCEo8WlA # Ben Eater
+https://www.youtube.com/feeds/videos.xml?channel_id=UCld68syR8Wi-GY_n4CaoJGA # Brodie Robertson
+https://www.youtube.com/feeds/videos.xml?channel_id=UCrv269YwJzuZL3dH5PCgxUw # CodeParade
+https://www.youtube.com/feeds/videos.xml?channel_id=UCSX3MR0gnKDxyXAyljWzm0Q # Computer Science
+...
+```
+
+### Choose the channels to add to your feeds
+
+Now comes the tedious and cringe part where you need to through all your old and obscure subscription,
+filtering the bad ones out.
+
+> Pro tip: If you want to automate this part you can ask Google to do it for you,
+> they know you better than yourself by now.
+
+## Stream videos to your computer
+
+### Dependencies
+
+* [youtube-dl](https://youtube-dl.org/) - download video from YouTube and other websites
+* [mpv](https://mpv.io/) - simple video player
+
+### Stream
+
+Replace `<link>` in the command bellow and you should have a video streaming in `mpv`.
+
+```
+youtube-dl -o - <link> | mpv -
+```
+
+> You can use the video player you want, just read the man to know how to read a video from standard input
+> (e.g for `vlc` it's `vlc -`)
+
+### Newsboat macro
+
+Add the following to `~/.newsboat/config`
+
+```
+macro v set browser "youtube-dl -o - %u | mpv -"; open-in-browser; set browser chromium
+```
+
+Now you can press `,` and `v` on a video feed entry to stream it to your player.
+
 ## Sources
-* [luke smith]()
+* [Luke Smith video on RSS feeds](https://www.youtube.com/watch?v=hMH9w6pyzvU)
+* `man stdbuf`
+* [Line buffering with pipes](https://stackoverflow.com/questions/293278)
+* [Remove duplicate lines without sorting?](https://stackoverflow.com/questions/11532157)
+* [How to merge every two lines into one from the command line?](https://stackoverflow.com/questions/9605232)
+* [RSS Wikipedia page](https://wikipedia.org/wiki/RSS)
+* [Arch wiki - newsboat - pass article to external command](https://wiki.archlinux.org/index.php/Newsboat#Pass_article_URL_to_external_command)
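
Review note: the formatting half of the new pipeline (the `awk` and `sed` stages after `grep`) can be tried without hitting YouTube. The sketch below is only an illustration, not part of the commit. The `sample` and `format_feeds` function names are made up here, and the input assumes `grep` emits each channel's title line before its feed URL, with the title repeated once, which may not match every real page.

```sh
#!/bin/sh
# Hypothetical stand-in for the grep output: title and feed lines,
# with one duplicated title (assumed order: title first, then feed URL).
sample() {
    printf '%s\n' \
        '<title>Ben Eater - YouTube</title>' \
        'https://www.youtube.com/feeds/videos.xml?channel_id=UCS0N5baNlQWJCUrhCEo8WlA' \
        '<title>Ben Eater - YouTube</title>' \
        '<title>CodeParade - YouTube</title>' \
        'https://www.youtube.com/feeds/videos.xml?channel_id=UCrv269YwJzuZL3dH5PCgxUw'
}

# Same awk/sed stages as in the diff, minus stdbuf, tee and cat.
format_feeds() {
    # awk: drop repeated lines while keeping the original order
    awk '!seen[$0]++' |
        # sed: join every two lines into one
        sed 'N; s/\n/ /' |
        # sed: move the channel name after the feed URL, as a comment
        sed 's_\(.*\)<title>\(.*\) - YouTube<\/title>\(.*\)_\1\3 # \2_' |
        # sed: strip leading spaces left over from the join
        sed 's/^ *\(.*\) *$/\1/'
}

sample | format_feeds
```

With this sample the script prints one `feed-url # channel name` line per channel, in the same shape as the `urls` excerpt in the diff. Note that the last `sed` only removes leading spaces in practice, because the greedy `\(.*\)` also captures any trailing ones.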
