Translating a Python Conditional to JavaScript Regex




I'm trying to convert a Python regex to a JavaScript regex:

https://github.com/rg3/youtube-dl/blob/a14e1538fe66c49ca8869681d2bbe60a36bd420d/youtube_dl/extractor/youtube.py#l134-l159

r"""(?x)^
    (
        (?:https?://|//)?                                   # http(s):// or protocol-independent URL (optional)
        (?:(?:(?:(?:\w+\.)?[yY][oO][uU][tT][uU][bB][eE](?:-nocookie)?\.com/|
           (?:www\.)?deturl\.com/www\.youtube\.com/|
           (?:www\.)?pwnyoutube\.com/|
           (?:www\.)?yourepeat\.com/|
           tube\.majestyc\.net/|
           youtube\.googleapis\.com/)                       # the various hostnames, with wildcard subdomains
        (?:.*?\#/)?                                         # handle anchor (#/) redirect urls
        (?:                                                 # the various things that can precede the ID:
            (?:(?:v|embed|e)/)                              # v/ or embed/ or e/
            |(?:                                            # or the v= param in all its forms
                (?:(?:watch|movie)(?:_popup)?(?:\.php)?/?)? # preceding watch(_popup|.php) or nothing (like /?v=xxxx)
                (?:\?|\#!?)                                 # the params delimiter ? or # or #!
                (?:.*?&)?                                   # any other preceding param (like /?s=tuff&v=xxxx)
                v=
            )
        ))
        |youtu\.be/                                         # just youtu.be/xxxx
        |https?://(?:www\.)?cleanvideosearch\.com/media/action/yt/watch\?videoID=
        )
    )?                                                      # all until now is optional -> you can pass the naked ID
    ([0-9A-Za-z_-]{11})                                     # here it is! the YouTube video ID
    (?(1).+)?                                               # if we found the ID, everything can follow
    $"""

I removed the quotes at the start and end, added the start (/^) and end (/i) delimiters, escaped the forward slashes, removed free-spacing mode, and ended up with this:

var valid_url = /^((?:https?:\/\/|\/\/)?(?:(?:(?:(?:\w+\.)?[yY][oO][uU][tT][uU][bB][eE](?:-nocookie)?\.com\/|(?:www\.)?deturl\.com\/www\.youtube\.com\/|(?:www\.)?pwnyoutube\.com\/|(?:www\.)?yourepeat\.com\/|tube\.majestyc\.net\/|youtube\.googleapis\.com\/)(?:.*?\#\/)?(?:(?:(?:v|embed|e)\/)|(?:(?:(?:watch|movie)(?:_popup)?(?:\.php)?\/?)?(?:\?|\#!?)(?:.*?&)?v=)))|youtu\.be\/|https?:\/\/(?:www\.)?cleanvideosearch\.com\/media\/action\/yt\/watch\?videoID=))?([0-9A-Za-z_-]{11})(?(1).+)?$/g;
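The manual clean-up described above (dropping free-spacing whitespace and comments) could also be scripted. The following is only a rough sketch: `stripVerbose` is a hypothetical helper, not part of any library, and it assumes that whitespace and `#` only need to survive inside escapes and character classes. The leading `(?x)` flag still has to be removed by hand, and the conditional still has to be rewritten separately.

```javascript
// Hypothetical helper: strip Python (?x) whitespace and # comments from a pattern.
// Assumption: whitespace and '#' matter only inside escapes and character classes.
function stripVerbose(pattern) {
  let out = "";
  let inClass = false;
  for (let i = 0; i < pattern.length; i++) {
    const ch = pattern[i];
    if (ch === "\\") {
      // Keep an escape together with the character it protects, e.g. \# or \.
      out += ch + (pattern[i + 1] ?? "");
      i++;
    } else if (inClass) {
      // Inside [...] everything is literal; watch for the closing bracket.
      if (ch === "]") inClass = false;
      out += ch;
    } else if (ch === "[") {
      inClass = true;
      out += ch;
    } else if (ch === "#") {
      // Comment: skip everything up to (not including) the end of the line.
      while (i + 1 < pattern.length && pattern[i + 1] !== "\n") i++;
    } else if (!/\s/.test(ch)) {
      out += ch;
    }
  }
  return out;
}

// A short excerpt in the style of the Python pattern (not the full regex).
const verbose = String.raw`
  (?:https?://|//)?   # optional scheme
  (?:.*?\#/)?         # anchor redirect
  ([0-9A-Za-z_-]{11}) # video ID
`;
const compact = stripVerbose(verbose);
console.log(compact);
```

Passing the result to `new RegExp(compact)` instead of writing a `/.../` literal also avoids having to escape the forward slashes.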

However, the JavaScript regex debugger I'm using reports an unexpected character "(" after "?". It refers to the JavaScript version of this part of the Python regex:

(?(1).+)? # if we found the ID, everything can follow

Any thoughts on how I can resolve this error?

JavaScript does not support conditionals.

But the world of regex has long survived without conditionals, and there are ways around it.
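The lack of support is easy to confirm: compiling a pattern that contains a conditional throws. A minimal check, using a made-up toy pattern rather than the full regex from the question:

```javascript
// Try to compile a pattern containing a Python-style conditional.
// JavaScript has no (?(1)...) syntax, so the RegExp constructor throws.
let compileError = null;
try {
  new RegExp("^(a)?b(?(1)c)$");
} catch (err) {
  compileError = err;
}
console.log(compileError instanceof SyntaxError); // true
```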

The Idea

The basic structure of your scary regex is this:

(capture A)? (match B) ( IF A was captured, (match C)? )

You can translate the IF into an OR:

(capture A) (match B) (match C)? OR (match B)
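To see the IF-to-OR translation on something small, here is a sketch with a made-up miniature pattern (the `yt/` prefix and three-digit ID are purely illustrative). In Python it would be `^(yt/)?(\d{3})(?(1).+)?$`; in JavaScript the conditional becomes two alternatives:

```javascript
// Branch 1: prefix (A) + id (B) + optional tail (C).
// Branch 2: the bare id (B) on its own.
const mini = /^(yt\/)(\d{3})(?:.+)?$|^(\d{3})$/;

const samples = ["yt/123", "yt/123?t=5", "123", "123?t=5"];
const results = samples.map((s) => mini.test(s));
console.log(results); // [ true, true, true, false ]
```

The last sample fails because a tail is only allowed when the prefix was present, which is what the conditional expressed.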

Converted Regex

Try this:

^((?:https?://|//)?(?:(?:(?:(?:\w+\.)?[yY][oO][uU][tT][uU][bB][eE](?:-nocookie)?\.com/|(?:www\.)?deturl\.com/www\.youtube\.com/|(?:www\.)?pwnyoutube\.com/|(?:www\.)?yourepeat\.com/|tube\.majestyc\.net/|youtube\.googleapis\.com/)(?:[^\n]*?#/)?(?:(?:(?:v|embed|e)/)|(?:(?:(?:watch|movie)(?:_popup)?(?:\.php)?/?)?(?:\?|#!?)(?:[^\n]*?&)?v=)))|youtu\.be/|https?://(?:www\.)?cleanvideosearch\.com/media/action/yt/watch\?videoID=)([0-9A-Za-z_-]{11})(?:[^\n]+)?)|^([0-9A-Za-z_-]{11})
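One way to try this pattern in JavaScript without escaping every forward slash is to build it with `new RegExp` from a `String.raw` template. This is only a sketch: the variable names and the sample URLs are assumptions for illustration, and the character classes are written with the upper/lower case pairs of the original youtube-dl pattern.

```javascript
// The converted pattern as a raw string, so "/" and "\" need no extra escaping.
const source = String.raw`^((?:https?://|//)?(?:(?:(?:(?:\w+\.)?[yY][oO][uU][tT][uU][bB][eE](?:-nocookie)?\.com/|(?:www\.)?deturl\.com/www\.youtube\.com/|(?:www\.)?pwnyoutube\.com/|(?:www\.)?yourepeat\.com/|tube\.majestyc\.net/|youtube\.googleapis\.com/)(?:[^\n]*?#/)?(?:(?:(?:v|embed|e)/)|(?:(?:(?:watch|movie)(?:_popup)?(?:\.php)?/?)?(?:\?|#!?)(?:[^\n]*?&)?v=)))|youtu\.be/|https?://(?:www\.)?cleanvideosearch\.com/media/action/yt/watch\?videoID=)([0-9A-Za-z_-]{11})(?:[^\n]+)?)|^([0-9A-Za-z_-]{11})`;
const validUrl = new RegExp(source);

// Illustrative inputs: a watch URL, a short URL, a bare ID, and a non-match.
const checks = [
  "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "youtu.be/dQw4w9WgXcQ",
  "dQw4w9WgXcQ",
  "not a video",
].map((input) => validUrl.test(input));
console.log(checks); // [ true, true, true, false ]
```

Note that the bare-ID branch has no trailing `$`, so an 11-character ID followed by other text (for example `dQw4w9WgXcQ&feature=x`) still matches, whereas the Python original would reject it. Appending `$` to that branch should restore the original behaviour if that matters for your use.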

Explanation

The (?(1)[^\n]+)? conditional tries to optionally match [^\n]+ if Group 1 is set. Since it occurs after the non-optional ([0-9A-Za-z_-]{11}), we can transform the conditional into an alternation |

I make no judgment on the suitability of the regex... I just rearranged the "grammar" without looking at the "words". :)

Either we match the whole of Group 1, followed directly by ([0-9A-Za-z_-]{11}) and the optional component, OR we directly match ([0-9A-Za-z_-]{11}).

If you are interested in retrieving ([0-9A-Za-z_-]{11}), note that depending on which side of the alternation matches it, it will live in a different capture group. I'll leave you to count the parentheses.

There are lots of parentheses you can remove, depending on your needs.
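As a sketch of the "different capture group" point: in the converted regex, counting the parentheses puts the ID in Group 2 when a URL prefix is present and in Group 3 for a bare ID. The pattern below is a trimmed, hypothetical stand-in with the same group layout (only two hosts), and `extractId` is an illustrative helper name:

```javascript
// Trimmed stand-in with the same group layout as the converted regex:
// group 1 = prefix + ID + tail, group 2 = ID after a prefix, group 3 = bare ID.
const trimmed = new RegExp(
  String.raw`^((?:https?://)?(?:(?:www\.)?youtube\.com/watch\?v=|youtu\.be/)([0-9A-Za-z_-]{11})(?:[^\n]+)?)|^([0-9A-Za-z_-]{11})`
);

// Hypothetical helper: return the ID from whichever branch matched, or null.
function extractId(input) {
  const m = trimmed.exec(input);
  if (!m) return null;
  // Groups that did not participate in the match are undefined.
  return m[2] ?? m[3];
}

console.log(extractId("https://www.youtube.com/watch?v=dQw4w9WgXcQ&t=10")); // dQw4w9WgXcQ
console.log(extractId("dQw4w9WgXcQ")); // dQw4w9WgXcQ
console.log(extractId("hello")); // null
```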

