├── .gitattributes ├── .gitignore ├── LICENSE ├── README.md ├── Setup.hs ├── example └── internals-pyspark-arrow.html ├── glancer.cabal ├── src ├── Captions.hs ├── Html.hs ├── Main.hs ├── Parser.hs └── Process.hs ├── stack.yaml ├── stack.yaml.lock ├── test └── Spec.hs └── this.code-workspace /.gitattributes: -------------------------------------------------------------------------------- 1 | * linguist-vendored 2 | *.hs linguist-vendored=false 3 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | .stack-work 2 | .hie 3 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright Ruben Berenguel (c) 2021 2 | 3 | All rights reserved. 4 | 5 | Redistribution and use in source and binary forms, with or without 6 | modification, are permitted provided that the following conditions are met: 7 | 8 | * Redistributions of source code must retain the above copyright 9 | notice, this list of conditions and the following disclaimer. 10 | 11 | * Redistributions in binary form must reproduce the above 12 | copyright notice, this list of conditions and the following 13 | disclaimer in the documentation and/or other materials provided 14 | with the distribution. 15 | 16 | * Neither the name of Author name here nor the names of other 17 | contributors may be used to endorse or promote products derived 18 | from this software without specific prior written permission. 19 | 20 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 21 | "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT 22 | LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR 23 | A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT 24 | OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, 25 | SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT 26 | LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, 27 | DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY 28 | THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT 29 | (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 30 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Glancer 2 | 3 | > **glancer**: 4 | 5 | > NOUN _informal_ a person who glances 6 | 7 | > **glance**: 8 | 9 | > VERB If you glance at something or someone, you look at them very quickly and then look away again immediately. 10 | 11 | > VERB If you glance through or at a newspaper, report, or book, you **spend a short time looking at it without reading it very carefully**. 12 | 13 | --- 14 | 15 | [Regularly updated examples](https://github.com/rberenguel/glances) 16 | 17 | --- 18 | 19 | - [Glancer](#glancer) 20 | - [Installation](#installation) 21 | - [Usage](#usage) 22 | - [Notes and TODOs](#notes-and-todos) 23 | 24 | --- 25 | 26 | The amount of online conferences has skyrocketed lately, I wonder why. This has caused my _Pending to watch_ list to balloon from 30-ish pending technical videos (which are already a lot) to more than 100. 27 | 28 | There are then 2 problems: 29 | 30 | 1. I have too many techie conference videos to watch 31 | 2. In a lot of cases I realise halfway through that the subject wasn't that interesting or that I already know the area to be covered. 32 | 33 | For a long while I have had a similar problem with written articles. I solved it by: 34 | 35 | 1. Forcing me to read a substantial amount by [writing a weekly list of the best ones](https://mostlymaths.net/tags/readings/) 36 | 2. Brutally stop reading any article that is not good enough to _possibly be_ in that list. 37 | 38 | This is easy in writing: you can quickly scan the text and decide if it looks interesting enough for a deep dive in a few seconds (tech article reads range from a few minutes to around half an hour, depending of how technical it may get). But there is no way of doing it in videos! You need to watch maybe 15-20 minutes to then realise "meh". 39 | 40 | _Glancer_ should help with this. Given a YouTube url, it will: 41 | 42 | - Download the corresponding video (to a temporary folder), 43 | - Download the auto-generated subtitles (assumes English, hardcoded), 44 | - Capture images from the video every N=30 seconds (hardcoded for the moment), 45 | - Convert the images to base64, 46 | - Create a standalone webpage with the screenshots on the left and the corresponding text on the right. 47 | 48 | The goal is to be able to glance at the talk to decide if you really want to watch it or not. The _standalone_ part of the created webpage is to make it easier to "watch"/"share" to my iPad/iPhone without having to move a folder full of images. The whole talk becomes just a 5-15 Mb HTML file. 49 | 50 | A couple of additional neat (for me at least) features: 51 | 52 | - Clicking/tapping on the image will enlarge it, in case you want to see some code block larger (I wanted hover, but it was too tricky on mobile). 53 | - Clicking on the arrow on the lower-right of the slid block will open the video on Youtube, at that moment in time. 54 | 55 | ## Installation 56 | 57 | ```bash 58 | git clone https://github.com/rberenguel/glancer 59 | cd glancer 60 | stack install 61 | ``` 62 | 63 | You will need to have installed/available in the path: 64 | 65 | - The `base64` executable (should be in all IX systems by default) 66 | - `cat` in `/bin/cat` (likewise) 67 | - [`yt-dlp`](https://github.com/yt-dlp/yt-dlp) installed. Note the _p_. There is currently a bug in the normal one with downloading auto-generated subtitles (again, this also happened a long time ago). 68 | - The [stack](https://docs.haskellstack.org/en/stable/install_and_upgrade/) Haskell build tool 69 | 70 | ## Usage 71 | 72 | ``` 73 | Usage: glancer URL FILEPATH 74 | Glancer 75 | 76 | Available options: 77 | URL Youtube URL 78 | FILEPATH HTML file name (don't add extension) 79 | -h,--help Show this help text 80 | ``` 81 | 82 | In other words, `glancer https://www.youtube.com/watch?v=JWQxd3YKWhs internals-pyspark-arrow` would create the webpage `internals-pyspark-arrow.html` in the current folder, after processing the talk I gave at Spark Summit 2019. You can see the generated file [here](https://www.mostlymaths.net/glancer/example/internals-pyspark-arrow.html). 83 | 84 | Sometimes `youtube-dlc` won't be able to find the embedded youtube video (I've seen this happen randomly in Spark Summit North America 2020 videos in databricks.com), in this case the process will fail. Try to feed it youtube urls directly. 85 | 86 | ## Notes and TODOs 87 | 88 | - [ ] Making the time between images customizable via the CLI (if I find out 30 is not good enough in general). 89 | - [ ] Add a test suite to harden subtitle parsing. I always think parsers will be small enough and that it will be "obvious" they work. It's never the case, at least I did it [right](https://github.com/rberenguel/haskset/blob/master/test/Spec.hs) [twice](https://github.com/rberenguel/bear-note-graph/blob/master/tests/test_parser.py). 90 | - [ ] Make the still images video-dependent (so several `glancer` commands can run concurrently, even if it's a bad idea) 91 | - [x] ~Some additional tweaks to the HTML/CSS (possibly adding some JS as well)~ 92 | 93 | ## Similar projects 94 | 95 | ### [natural-language-youtube-search](https://github.com/haltakov/natural-language-youtube-search) 96 | 97 | This project downloads the YouTube video, extracts every N-th frame and uses neural networks to classify the content of each slide, allowing you to search by text. Impressive! 98 | 99 | --- 100 | 101 | _Note_: This README is long and winding on purpose. 102 | -------------------------------------------------------------------------------- /Setup.hs: -------------------------------------------------------------------------------- 1 | import Distribution.Simple 2 | main = defaultMain 3 | -------------------------------------------------------------------------------- /glancer.cabal: -------------------------------------------------------------------------------- 1 | cabal-version: 2.2 2 | name: glancer 3 | version: 0.1.0.0 4 | -- synopsis: 5 | -- description: 6 | homepage: https://github.com/rberenguel/glancer#readme 7 | license: BSD-3-Clause 8 | license-file: LICENSE 9 | author: Ruben Berenguel 10 | copyright: 2021 Ruben Berenguel 11 | category: Misc 12 | build-type: Simple 13 | extra-source-files: README.md 14 | 15 | common deps 16 | build-depends: base >= 4.7 && < 5, text, 17 | bytestring, raw-strings-qq, 18 | megaparsec, directory, filepath, mtl, 19 | html-entities 20 | default-language: Haskell2010 21 | ghc-options: -fwarn-incomplete-patterns 22 | -fwarn-unused-imports 23 | -Werror=incomplete-patterns 24 | -fwrite-ide-info 25 | -hiedir=.hie 26 | 27 | executable glancer 28 | import: deps 29 | hs-source-dirs: src 30 | main-is: Main.hs 31 | other-modules: Parser, Process, Html, Captions 32 | build-depends: optparse-applicative, hspec, containers, 33 | process, temporary,random, process-extras 34 | 35 | test-suite tests 36 | import: deps 37 | type: exitcode-stdio-1.0 38 | hs-source-dirs: test, src 39 | main-is: Spec.hs 40 | build-depends: hspec, QuickCheck, hspec-megaparsec -------------------------------------------------------------------------------- /src/Captions.hs: -------------------------------------------------------------------------------- 1 | {-# LANGUAGE OverloadedStrings #-} 2 | 3 | module Captions where 4 | 5 | import Data.Coerce (coerce) 6 | import Data.Foldable (Foldable (toList)) 7 | import Data.Sequence (fromList, mapWithIndex) 8 | import Data.Text (intercalate, strip) 9 | import qualified Data.Text as T 10 | import Data.Text.Encoding (decodeUtf8) 11 | import Html (embody, heading) 12 | import Parser (Caption (block, txt), TimeBlock (..), TimeDef (..), inBlock, secs) 13 | import Process 14 | ( Dir (..), 15 | Url (..), 16 | Video (..), 17 | deleteImages, 18 | ) 19 | import System.IO 20 | ( hPutStrLn, 21 | stderr, 22 | ) 23 | import qualified System.Process.ByteString as B 24 | import Text.Printf (printf) 25 | 26 | convertToHTML :: Show a => Video -> Dir -> Either a [Caption] -> IO T.Text 27 | convertToHTML video dir parsed = case parsed of 28 | Right captions -> do 29 | html <- captionsToHTML video dir captions 30 | hPutStrLn stderr "Finished" 31 | return html 32 | Left err -> return (T.pack (show err)) 33 | 34 | captionsToHTML :: Video -> Dir -> [Caption] -> IO T.Text 35 | captionsToHTML video dir captions = do 36 | caps <- formatCaptions captions (coerce $ url video) dir 37 | deleteImages dir 38 | let embodied = embody video caps 39 | return (heading <> embodied) 40 | 41 | formatCaptions :: [Caption] -> Url -> Dir -> IO T.Text 42 | formatCaptions captions (Url url) (Dir dir) = do 43 | let toMap = fromList $ captionsPerSlide captions 44 | let listy = toList (mapWithIndex (imgCaps url (T.pack dir)) toMap) 45 | intercalate "\n" <$> sequence listy 46 | 47 | imgCaps :: Integral a => T.Text -> T.Text -> a -> [Caption] -> IO T.Text 48 | imgCaps url dir ind captions = do 49 | --let next = toInteger ind + 1 50 | img <- slideBlock url dir (toInteger ind) --next 51 | let toVideo = toVideoBlock url (toInteger ind) 52 | return (img <> caps captions <> toVideo <> "") 53 | 54 | slideBlock :: T.Text -> T.Text -> Integer -> IO T.Text 55 | slideBlock url dir shot = do 56 | let imgPath = T.unpack (dir <> "/glancer-img" <> T.pack (printf "%04d.jpg" shot)) 57 | (_, jpg, _) <- B.readProcessWithExitCode "/bin/cat" [imgPath] "" 58 | (_, base64, _) <- B.readProcessWithExitCode "base64" [] jpg 59 | let slide = "
\n" 60 | let div = "\t
\n" 61 | let img = "\t\t\n" 62 | let close = "\t
\n" 63 | let formatted = slide <> div <> img <> close 64 | return formatted 65 | 66 | toVideoBlock :: T.Text -> Integer -> T.Text 67 | toVideoBlock url shot = do 68 | let when = T.pack $ show (shotSeconds shot 30) 69 | let title = "title='Go to video at timestamp " <> when <> "s'" 70 | let diva = "
title <> "href='" <> url <> "&t=" <> when <> "s'>⇰
" 71 | diva 72 | 73 | caps :: [Caption] -> T.Text 74 | caps captions = "\t
\n" <> intercalate "\n" (stripped captions) <> "\n\t
" 75 | where 76 | stripped captions = map (("\t\t" <>) . strip . txt) captions 77 | 78 | captionsPerSlide :: [Caption] -> [[Caption]] 79 | captionsPerSlide captions = map (capsForShot (filtering captions) 30) (shots (filtering captions)) 80 | where 81 | filtering lst = filter (not . (T.isInfixOf "" . txt)) lst 82 | shots captions = [1 .. numShots captions 30 + 1] 83 | 84 | shotSeconds :: Integer -> Integer -> Integer 85 | shotSeconds shotNumber secsPerShot = shotNumber * secsPerShot 86 | 87 | hourMinuteSeconds :: Integer -> TimeDef 88 | hourMinuteSeconds seconds = TimeDef h m s 0 89 | where 90 | h = seconds `div` 3600 91 | m = seconds `mod` 3600 `div` 60 92 | s = seconds `mod` 3600 `mod` 60 93 | 94 | shotTime :: Integer -> Integer -> TimeBlock 95 | shotTime shotNumber secsPerShot = TimeBlock startTime endTime 96 | where 97 | startTime = hourMinuteSeconds start_ 98 | endTime = hourMinuteSeconds end_ 99 | start_ = shotSeconds (shotNumber - 1) secsPerShot 100 | end_ = shotSeconds shotNumber secsPerShot 101 | 102 | numShots :: [Caption] -> Integer -> Integer 103 | numShots caps secsPerShot = secs (end . block . last $ caps) `div` secsPerShot 104 | 105 | capsForShot :: [Caption] -> Integer -> Integer -> [Caption] 106 | capsForShot caps secsPerShot shotNumber = filter (inTimeBlock shotBlock) caps 107 | where 108 | shotBlock = shotTime shotNumber secsPerShot 109 | inTimeBlock blck cap = inBlock (start . block $ cap) blck && inBlock (end . block $ cap) blck 110 | -------------------------------------------------------------------------------- /src/Html.hs: -------------------------------------------------------------------------------- 1 | {-# LANGUAGE OverloadedStrings #-} 2 | {-# LANGUAGE QuasiQuotes #-} 3 | 4 | module Html where 5 | 6 | import qualified Data.Text as T 7 | import qualified HTMLEntities.Text as H 8 | import Process (Title (..), Url (..), Video (..)) 9 | import Text.RawString.QQ (r) 10 | 11 | heading :: T.Text 12 | heading = 13 | T.pack 14 | [r| 15 | 16 | 17 | 18 | 19 | 104 | 111 | 112 | |] 113 | 114 | embody :: Video -> T.Text -> T.Text 115 | embody (Video (Url url) (Title title) _) body = 116 | T.intercalate 117 | "\n" 118 | [ "\t", 119 | "\t\t
", 120 | "\t\t\t

" <> H.text title <> "

", 121 | "\t\t\t

Created with glancer

", 122 | body, 123 | "", 124 | "" 125 | ] 126 | -------------------------------------------------------------------------------- /src/Main.hs: -------------------------------------------------------------------------------- 1 | {-# LANGUAGE OverloadedStrings #-} 2 | 3 | module Main where 4 | 5 | import Captions (convertToHTML) 6 | import Data.Coerce (coerce) 7 | import Data.Text (pack) 8 | import qualified Data.Text as T 9 | import GHC.IO.Encoding.Latin1 (ascii) 10 | import qualified Options.Applicative as Ap 11 | import Parser (subsP) 12 | import Process 13 | ( Dir (..), 14 | Filename (..), 15 | Url (..), 16 | Video (..), 17 | processURL, 18 | ) 19 | import System.Directory (getHomeDirectory) 20 | import System.FilePath (joinPath, splitPath, (-<.>), ()) 21 | import System.IO 22 | ( IOMode (ReadMode), 23 | hGetContents, 24 | hPutStrLn, 25 | hSetEncoding, 26 | openFile, 27 | stderr, 28 | ) 29 | import Text.Megaparsec (parse) 30 | import Prelude 31 | 32 | data CLIConfig = CLIConfig 33 | { _url :: String, 34 | _filename :: String 35 | } 36 | 37 | cliConfig :: Ap.Parser CLIConfig 38 | cliConfig = 39 | CLIConfig 40 | <$> Ap.strArgument (Ap.metavar "URL" <> Ap.help "Youtube URL") 41 | <*> Ap.strArgument (Ap.metavar "FILEPATH" <> Ap.help "HTML file name (don't add extension)") 42 | 43 | getFullPath :: FilePath -> IO FilePath 44 | getFullPath s = case splitPath s of 45 | "~/" : t -> joinPath . (: t) <$> getHomeDirectory 46 | _ -> return s 47 | 48 | start :: CLIConfig -> IO () 49 | start (CLIConfig url filename) = do 50 | hPutStrLn stderr ("Looking for video in " ++ url) 51 | (dir_, video) <- processURL (Url $ T.pack url) 52 | let videoName = coerce (file video) 53 | let dir = coerce dir_ 54 | capsPath <- getFullPath (dir videoName -<.> "en.vtt") 55 | handle <- openFile capsPath ReadMode 56 | hSetEncoding handle ascii 57 | contents <- hGetContents handle 58 | let parsed = parse subsP "" (pack contents) 59 | html <- convertToHTML video dir_ parsed 60 | destinationPath <- getFullPath (filename -<.> "html") 61 | hPutStrLn stderr ("Writing html to " ++ destinationPath) 62 | writeFile destinationPath (T.unpack html) 63 | hPutStrLn stderr ("Data written to " ++ destinationPath) 64 | 65 | main :: IO () 66 | main = do 67 | Main.start =<< Ap.execParser opts 68 | where 69 | opts = 70 | Ap.info 71 | (cliConfig Ap.<**> Ap.helper) 72 | ( Ap.fullDesc 73 | <> Ap.progDesc "Glancer" 74 | <> Ap.header "Why not" 75 | ) 76 | -------------------------------------------------------------------------------- /src/Parser.hs: -------------------------------------------------------------------------------- 1 | {-# LANGUAGE OverloadedStrings #-} 2 | 3 | module Parser where 4 | 5 | import Control.Monad (void) 6 | import qualified Data.Text as T 7 | import Data.Void (Void) 8 | import Text.Megaparsec 9 | ( MonadParsec (eof, lookAhead, try), 10 | Parsec, 11 | manyTill, 12 | (<|>), 13 | ) 14 | import Text.Megaparsec.Char 15 | ( asciiChar, 16 | char, 17 | eol, 18 | letterChar, 19 | numberChar, 20 | space, 21 | spaceChar, 22 | string, 23 | ) 24 | 25 | type Parser = Parsec Void T.Text 26 | 27 | data TimeDef = TimeDef 28 | { hours :: Integer, 29 | minutes :: Integer, 30 | seconds :: Integer, 31 | cents :: Integer 32 | } 33 | deriving (Show, Eq) 34 | 35 | secs :: TimeDef -> Integer 36 | secs td = 3600 * hours td + 60 * minutes td + seconds td + 1 37 | 38 | instance Ord TimeDef where 39 | compare a b = compare (secs a) (secs b) 40 | 41 | data TimeBlock = TimeBlock 42 | { start :: TimeDef, 43 | end :: TimeDef 44 | } 45 | deriving (Show, Eq) 46 | 47 | inBlock :: TimeDef -> TimeBlock -> Bool 48 | inBlock td tb = (secs . start $ tb) <= secstd && secstd <= (secs . end $ tb) 49 | where 50 | secstd = secs td 51 | 52 | data Caption = Caption 53 | { block :: TimeBlock, 54 | txt :: T.Text 55 | } 56 | deriving (Show, Eq) 57 | 58 | hourBlockP :: Parser TimeDef 59 | hourBlockP = do 60 | hours <- colonSeparated 61 | minutes <- colonSeparated 62 | seconds <- read <$> manyTill numberChar (char '.') 63 | millis <- read <$> manyTill numberChar (void spaceChar <|> try (lookAhead (void eol))) 64 | return (TimeDef hours minutes seconds millis) 65 | where 66 | colonSeparated = read <$> manyTill numberChar (char ':') 67 | 68 | arrowP :: Parser T.Text 69 | arrowP = do 70 | string "-->" 71 | 72 | anyChar :: Parser Char 73 | anyChar = letterChar <|> spaceChar <|> asciiChar 74 | 75 | timeLineP :: Parser TimeBlock 76 | timeLineP = do 77 | start <- hourBlockP 78 | arrowP 79 | space 80 | end <- hourBlockP 81 | manyTill anyChar eol 82 | return (TimeBlock start end) 83 | 84 | captionP :: Parser Caption 85 | captionP = do 86 | space 87 | block <- timeLineP 88 | caption <- T.pack <$> manyTill anyChar (try $ lookAhead (void timeLineP <|> void eof)) 89 | return (Caption block caption) 90 | 91 | subsP :: Parser [Caption] 92 | subsP = do 93 | string "WEBVTT" 94 | manyTill anyChar (try $ lookAhead timeLineP) 95 | space 96 | manyTill captionP eof 97 | -------------------------------------------------------------------------------- /src/Process.hs: -------------------------------------------------------------------------------- 1 | {-# LANGUAGE OverloadedStrings #-} 2 | 3 | module Process where 4 | 5 | import Control.Monad (replicateM) 6 | import qualified Data.ByteString.Char8 as B 7 | import Data.Coerce (coerce) 8 | import qualified Data.Text as T 9 | import qualified Data.Text.Encoding as TE 10 | import System.Directory (getTemporaryDirectory) 11 | import System.FilePath ((-<.>), ()) 12 | import System.IO 13 | ( hPutStrLn, 14 | stderr, 15 | ) 16 | import System.Process (callCommand, callProcess, readProcess) 17 | import System.Random (Random (randomRIO)) 18 | 19 | newtype Url = Url T.Text 20 | 21 | newtype Title = Title T.Text 22 | 23 | newtype Id = Id T.Text 24 | 25 | newtype Dir = Dir String 26 | 27 | newtype Filename = Filename String 28 | 29 | data Video = Video 30 | { url :: Url, 31 | title :: Title, 32 | file :: Filename 33 | } 34 | 35 | getTitle :: Url -> IO Title 36 | getTitle (Url url) = do 37 | title <- readProcess "yt-dlp" ["-e", "--no-warnings", "--no-playlist", T.unpack url] [] 38 | return (Title $ TE.decodeUtf8 $ B.pack title) -- This deletes badly encoded characters, which is better than having them but worse than properly encoding them. Help appreciated 39 | 40 | getId :: Url -> IO Id 41 | getId (Url url) = do 42 | id <- readProcess "yt-dlp" ["--get-id", "--no-warnings", "--no-playlist", T.unpack url] [] 43 | return (Id $ T.pack id) 44 | 45 | youtubeURL :: Id -> Url 46 | youtubeURL (Id id) = Url ("https://www.youtube.com/watch?v=" <> id) 47 | 48 | generateVideo :: Video -> Dir -> IO () 49 | generateVideo (Video (Url url) _ (Filename videoName)) (Dir dir) = do 50 | callProcess command arguments 51 | where 52 | command = "yt-dlp" 53 | arguments = ["-q", "--no-playlist", "-f mp4", coerce ("-o" <> dir videoName -<.> "mp4"), "--sub-langs", "en", "--write-auto-sub", "--write-sub", "--no-warnings", "-k", "--no-cache-dir", T.unpack url] 54 | 55 | args :: Dir -> Filename -> [String] -> String -> [String] 56 | args (Dir dir) (Filename videoName) selector suffix = [ "-i", 57 | coerce (dir videoName -<.> "mp4")] 58 | ++ selector ++ [ 59 | coerce (dir "glancer-img" <> suffix -<.> "jpg"), 60 | "-hide_banner", 61 | "-loglevel", 62 | "panic" 63 | ] 64 | 65 | generateShots :: Dir -> Filename -> IO () 66 | generateShots dir video = do 67 | callProcess "ffmpeg" (args dir video ["-vf", "fps=1/30"] "%04d") 68 | callProcess "ffmpeg" (args dir video ["-vframes", "1", "-ss", "3"] "0000" ) 69 | 70 | deleteVideo :: Dir -> Filename -> IO () 71 | deleteVideo (Dir dir) (Filename videoName) = callProcess "rm" [coerce (dir videoName -<.> "mp4")] 72 | 73 | deleteImages :: Dir -> IO () 74 | deleteImages (Dir dir) = callCommand $ T.unpack ("rm " <> T.pack (dir "glancer-img*")) 75 | 76 | processURL :: Url -> IO (Dir, Video) 77 | processURL url = do 78 | dir <- Dir <$> getTemporaryDirectory 79 | videoName <- Filename <$> replicateM 10 (randomRIO ('a', 'z')) 80 | title <- getTitle url 81 | hPutStrLn stderr $ T.unpack ("The video is titled '" <> T.strip (coerce title) <> "'") 82 | id <- getId url 83 | let yourl = youtubeURL id 84 | hPutStrLn stderr $ T.unpack (T.strip ("Seems like the video is in " <> coerce yourl)) 85 | let video = Video yourl title videoName 86 | hPutStrLn stderr "Downloading video (this may take a while)" 87 | generateVideo video dir 88 | hPutStrLn stderr (T.unpack ("Downloaded video to " <> (T.pack . coerce) dir <> (T.pack . coerce) videoName <> "(.mp4|en.vtt)")) 89 | hPutStrLn stderr "Generating still images from video (this may take a while)" 90 | generateShots dir videoName 91 | hPutStrLn stderr "Generated images" 92 | deleteVideo dir videoName 93 | return (dir, video) 94 | -------------------------------------------------------------------------------- /stack.yaml: -------------------------------------------------------------------------------- 1 | # This file was automatically generated by 'stack init' 2 | # 3 | # Some commonly used options have been documented as comments in this file. 4 | # For advanced use and comprehensive documentation of the format, please see: 5 | # https://docs.haskellstack.org/en/stable/yaml_configuration/ 6 | 7 | # Resolver to choose a 'specific' stackage snapshot or a compiler version. 8 | # A snapshot resolver dictates the compiler version and the set of packages 9 | # to be used for project dependencies. For example: 10 | # 11 | # resolver: lts-3.5 12 | # resolver: nightly-2015-09-21 13 | # resolver: ghc-7.10.2 14 | # 15 | # The location of a snapshot can be provided as a file or url. Stack assumes 16 | # a snapshot provided as a file might change, whereas a url resource does not. 17 | # 18 | # resolver: ./custom-snapshot.yaml 19 | # resolver: https://example.com/snapshots/2018-01-01.yaml 20 | resolver: lts-16.31 21 | 22 | 23 | # User packages to be built. 24 | # Various formats can be used as shown in the example below. 25 | # 26 | # packages: 27 | # - some-directory 28 | # - https://example.com/foo/bar/baz-0.0.2.tar.gz 29 | # subdirs: 30 | # - auto-update 31 | # - wai 32 | packages: 33 | - . 34 | # Dependency packages to be pulled from upstream that are not in the resolver. 35 | # These entries can reference officially published versions as well as 36 | # forks / in-progress versions pinned to a git hash. For example: 37 | # 38 | # extra-deps: 39 | # - acme-missiles-0.3 40 | # - git: https://github.com/commercialhaskell/stack.git 41 | # commit: e7b331f14bcffb8367cd58fbfc8b40ec7642100a 42 | # 43 | # extra-deps: [] 44 | 45 | # Override default flag values for local packages and extra-deps 46 | # flags: {} 47 | 48 | # Extra package databases containing global packages 49 | # extra-package-dbs: [] 50 | 51 | # Control whether we use the GHC we find on the path 52 | # system-ghc: true 53 | # 54 | # Require a specific version of stack, using version ranges 55 | # require-stack-version: -any # Default 56 | # require-stack-version: ">=2.3" 57 | # 58 | # Override the architecture used by stack, especially useful on Windows 59 | # arch: i386 60 | # arch: x86_64 61 | # 62 | # Extra directories used by stack for building 63 | # extra-include-dirs: [/path/to/dir] 64 | # extra-lib-dirs: [/path/to/dir] 65 | # 66 | # Allow a newer minor version of GHC than the snapshot specifies 67 | # compiler-check: newer-minor 68 | -------------------------------------------------------------------------------- /stack.yaml.lock: -------------------------------------------------------------------------------- 1 | # This file was autogenerated by Stack. 2 | # You should not edit this file by hand. 3 | # For more information, please see the documentation at: 4 | # https://docs.haskellstack.org/en/stable/lock_files 5 | 6 | packages: [] 7 | snapshots: 8 | - completed: 9 | size: 534126 10 | url: https://raw.githubusercontent.com/commercialhaskell/stackage-snapshots/master/lts/16/31.yaml 11 | sha256: 637fb77049b25560622a224845b7acfe81a09fdb6a96a3c75997a10b651667f6 12 | original: lts-16.31 13 | -------------------------------------------------------------------------------- /test/Spec.hs: -------------------------------------------------------------------------------- 1 | {-# LANGUAGE OverloadedStrings #-} 2 | {-# LANGUAGE QuasiQuotes #-} 3 | 4 | import Data.Text (pack) 5 | import qualified Parser as P 6 | import Test.Hspec ( hspec, describe, it, Spec ) 7 | import Test.Hspec.Megaparsec ( shouldParse ) 8 | import Text.Megaparsec ( parse ) 9 | import Text.RawString.QQ ( r ) 10 | 11 | main :: IO () 12 | main = hspec spec 13 | 14 | hourBlock = "01:07:46.029 " 15 | 16 | parsedHourBlock = P.TimeDef 1 7 46 29 17 | 18 | timeBlock = 19 | pack 20 | [r|01:07:46.029 --> 01:07:52.319 align:start position:0% 21 | |] 22 | 23 | parsedTimeBlock = P.TimeBlock parsedHourBlock (P.TimeDef 1 7 52 319) 24 | 25 | captionBlock = 26 | pack 27 | [r| 28 | 01:07:46.029 --> 01:07:52.319 align:start position:0% 29 | really look into but we have considered 30 | |] 31 | 32 | parsedCaptionBlock = P.Caption parsedTimeBlock "really look into but we have considered\n" 33 | 34 | spec :: Spec 35 | spec = do 36 | describe "arrowP" $ do 37 | it "parses a time arrow" $ 38 | parse P.arrowP "" "-->" `shouldParse` "-->" 39 | describe "hourBlockP" $ do 40 | it "parses a hourBlock" $ 41 | parse P.hourBlockP "" hourBlock `shouldParse` parsedHourBlock 42 | describe "timeBlockP" $ do 43 | it "parses a timeBlock" $ 44 | parse P.timeLineP "" timeBlock `shouldParse` parsedTimeBlock 45 | describe "captionP" $ do 46 | it "parses a captionBlock" $ parse P.captionP "" captionBlock `shouldParse` parsedCaptionBlock 47 | -------------------------------------------------------------------------------- /this.code-workspace: -------------------------------------------------------------------------------- 1 | { 2 | "folders": [ 3 | { 4 | "path": "." 5 | } 6 | ], 7 | "settings": { 8 | "files.watcherExclude": { 9 | "**/target": true 10 | } 11 | } 12 | } --------------------------------------------------------------------------------