View source for Module:Citation/CS1/Links
--[[
This module handles wiki internal links and external links. It currently exports four functions:

check_for_external_link() checks a batch of parameters for external links that they may contain.

make_external_link() builds a wiki external link from the given url and "display" arguments and
validates the format of the url. If the url is malformed and the "source" argument is set, a
"CS1 error" message is generated from that argument and appended after the generated link; if
"source" is not set, an internal error is thrown and execution stops.

make_internal_link() builds a wiki internal link from the given "target page" and "display"
arguments and lightly validates the format of the target page. If the target page is malformed
(for example, it contains characters that are not allowed in page titles) and the "source"
argument is set, a "CS1 error" message is generated from that argument; after processing by other
modules, that message is shown at the end of the citation. If "source" is not set, an internal
error is thrown and execution stops.

Both of these functions can be used on template parameters and also to convert the relevant
parameters in the configuration module. Note, however, that in the former case the "source"
argument (the name of the parameter being processed) must be supplied and must be valid, and in
the latter case the configuration module must be written correctly. Otherwise a Lua error results!

remove_wiki_link() strips wiki internal link markup.
]]

--[[--------------------------< F O R W A R D   D E C L A R A T I O N S >--------------------------------------
]]

local is_set;
local append_error, set_error, throw_error;


--[[--------------------------< I S _ S C H E M E >------------------------------------------------------------

does this thing that purports to be a uri scheme seem to be a valid scheme? The scheme is checked to see if it
is in agreement with http://tools.ietf.org/html/std66#section-3.1 which says:
	Scheme names consist of a sequence of characters beginning with a
	letter and followed by any combination of letters, digits, plus
	("+"), period ("."), or hyphen ("-").

returns true if it does, else false

]]

local function is_scheme (scheme)
	return scheme and scheme:match ('^%a[%a%d%+%.%-]*:');					-- true if scheme is set and matches the pattern
end


--[=[-------------------------< I S _ D O M A I N _ N A M E >--------------------------------------------------

Does this thing that purports to be a domain name seem to be a valid domain name?

Syntax defined here: http://tools.ietf.org/html/rfc1034#section-3.5
BNF defined here: https://tools.ietf.org/html/rfc4234
Single character names are generally reserved; see https://tools.ietf.org/html/draft-ietf-dnsind-iana-dns-01#page-15;
	see also [[Single-letter second-level domain]]
list of tlds: https://www.iana.org/domains/root/db

rfc952 (modified by rfc 1123) requires the first and last character of a hostname to be a letter or a digit. Between
the first and last characters the name may use letters, digits, and the hyphen.

Also allowed are IPv4 addresses. IPv6 not supported

domain is expected to be stripped of any path so that the last character in the domain is the last character of the tld. tld
is two or more alpha characters. Any preceding '//' (from splitting a url with a scheme) will be stripped
here. Perhaps not necessary but retained in case it is necessary for IPv4 dot decimal.

There are several tests:
	the first character of the whole domain name including subdomains must be a letter or a digit
	single-letter/digit second-level domains in the .org TLD
	q, x, and z SL domains in the .com TLD
	i and q SL domains in the .net TLD
	single-letter SL domains in the ccTLDs (where the ccTLD is two letters)
	two-character SL domains in gTLDs (where the gTLD is two or more letters)
	three-plus-character SL domains in gTLDs (where the gTLD is two or more letters)
	IPv4 dot-decimal address format; TLD not allowed

returns true if domain appears to be a proper name and tld or IPv4 address, else false

]=]

local function is_domain_name (domain)
	if not domain then
		return false;														-- if not set, abandon
	end

	domain = domain:gsub ('^//', '');										-- strip '//' from domain name if present; done here so we only have to do it once

	if not domain:match ('^[%a%d]') then									-- first character must be letter or digit
		return false;
	end

	if domain:match ('%f[%a%d][%a%d]%.org$') then							-- one character .org hostname
		return true;
	elseif domain:match ('%f[%a][qxz]%.com$') then							-- assigned one character .com hostname (x.com times out 2015-12-10)
		return true;
	elseif domain:match ('%f[%a][iq]%.net$') then							-- assigned one character .net hostname (q.net registered but not active 2015-12-10)
		return true;
	elseif domain:match ('%f[%a%d][%a%d][%a%d%-]+[%a%d]%.xn%-%-[%a%d]+$') then	-- internationalized domain name with ACE prefix
		return true;
	elseif domain:match ('%f[%a%d][%a%d]%.cash$') then						-- one character/digit .cash hostname
		return true;
	elseif domain:match ('%f[%a%d][%a%d]%.%a%a$') then						-- one character hostname and cctld (2 chars)
		return true;
	elseif domain:match ('%f[%a%d][%a%d][%a%d]%.%a%a+$') then				-- two character hostname and tld
		return true;
	elseif domain:match ('%f[%a%d][%a%d][%a%d%-]+[%a%d]%.%a%a+$') then		-- three or more character hostname.hostname or hostname.tld
		return true;
	elseif domain:match ('^%d%d?%d?%.%d%d?%d?%.%d%d?%d?%.%d%d?%d?') then	-- IPv4 address
		return true;
	else
		return false;
	end
end


--[[--------------------------< I S _ U R L >------------------------------------------------------------------

returns true if the scheme and domain parts of a url appear to be a valid url; else false.

This function is the last step in the validation process. This function is separate because there are cases that
are not covered by split_url(), for example is_parameter_ext_wikilink() which is looking for bracketed external
wikilinks.

]]

local function is_url (scheme, domain)
	if is_set (scheme) then													-- if scheme is set check it and domain
		return is_scheme (scheme) and is_domain_name (domain);
	else
		return is_domain_name (domain);										-- scheme not set when url is protocol relative
	end
end


--[[--------------------------< S P L I T _ U R L >------------------------------------------------------------

Split a url into a scheme, authority indicator, and domain.
If protocol relative url, return nil scheme and domain else return nil for both scheme and domain.

When not protocol relative, get scheme, authority indicator, and domain. If there is an authority indicator (one
or more '/' characters following the scheme's colon), make sure that there are only 2.

]]

local function split_url (url_str)
	local scheme, authority, domain;

	url_str = url_str:gsub ('([%a%d])%.?[/%?#].*$', '%1');					-- strip FQDN terminator and path(/), query(?), fragment (#) (the capture prevents false replacement of '//')

	if url_str:match ('^//%S*') then										-- if there is what appears to be a protocol relative url
		domain = url_str:match ('^//(%S*)')
	elseif url_str:match ('%S-:/*%S+') then									-- if there is what appears to be a scheme, optional authority indicator, and domain name
		scheme, authority, domain = url_str:match ('(%S-:)(/*)(%S+)');		-- extract the scheme, authority indicator, and domain portions
		authority = authority:gsub ('//', '', 1);							-- replace 1 pair of '/' with nothing;
		if is_set(authority) then											-- if anything left (1 or 3+ '/' where authority should be) then
			return scheme;													-- return scheme only making domain nil which will cause an error message
		end
		domain = domain:gsub ('(%a):%d+', '%1');							-- strip port number if present
	end

	return scheme, domain;
end


--[[--------------------------< L I N K _ P A R A M _ O K >---------------------------------------------------

checks the content of |title-link=, |series-link=, |author-link= etc for properly formatted content: no wikilinks, no urls

Link parameters are to hold the title of a wikipedia article so none of the WP:TITLESPECIALCHARACTERS are allowed:
	# < > [ ] | { } _
except the underscore which is used as a space in wiki urls and # which is used for section links

returns false when the value contains any of these characters.

When there are no illegal characters, this function returns TRUE if value DOES NOT appear to be a valid url (the
|<param>-link= parameter is ok); else false when value appears to be a valid url (the |<param>-link= parameter is NOT ok).

]]

local function link_param_ok (value)
	local scheme, domain;
	if value:find ('[<>%[%]|{}]') then										-- if any prohibited characters
		return false;
	end

	scheme, domain = split_url (value);										-- get scheme or nil and domain or nil from url;
	return not is_url (scheme, domain);										-- return true if value DOES NOT appear to be a valid url
end


--[=[-------------------------< M A K E _ I N T E R N A L _ L I N K >------------------------------------------

Format a wikilink with error checking; when both link and display text are provided, returns a wikilink in the
form [[L|D]]; if only link is provided (or link and display are the same), returns a wikilink in the form [[L]].

]=]

local function make_internal_link (link, display, source)
	if not link_param_ok (link) then
		if is_set (source) then
			append_error ('bad_paramlink', {source});
		else
			throw_error('bad_link_no_origin');
		end
	end
	if is_set (display) and link ~= display then
		return table.concat ({'[[', link, '|', display, ']]'});
	else
		return table.concat ({'[[', link, ']]'});
	end
end


--[=[-------------------------< R E M O V E _ W I K I _ L I N K >----------------------------------------------

Gets the display text from a wikilink like [[A|B]] or [[B]] gives B

The str:gsub() returns either A|B from a [[A|B]] or B from [[B]] or B from B (no wikilink markup).

In l(), l:gsub() removes the link and pipe (if they exist); the second :gsub() trims whitespace from the label
if str was wrapped in wikilink markup. Presumably, this is because without wikimarkup in str, there is no match
in the initial gsub, the replacement function l() doesn't get called.

]=]

local function remove_wiki_link (str)
	return (str:gsub ("%[%[([^%[%]]*)%]%]", function(l)
		return l:gsub ("^[^|]*|(.*)$", "%1" ):gsub ("^%s*(.-)%s*$", "%1");
	end));
end


--[=[-------------------------< I S _ W I K I L I N K >--------------------------------------------------------

Determines if str is a wikilink, extracts, and returns the wikilink type, link text, and display text parts.
If str is a complex wikilink ([[L|D]]):
	returns wl_type 2 and D and L from [[L|D]];
if str is a simple wikilink ([[D]])
	returns wl_type 1 and D from [[D]] and L as empty string;
if not a wikilink:
	returns wl_type 0, str as D, and L as empty string.

trims leading and trailing whitespace and pipes from L and D ([[L|]] and [[|D]] are accepted by MediaWiki and
treated like [[D]]; while [[|D|]] is not accepted by MediaWiki, here, we accept it and return D without the pipes).

]=]

local function is_wikilink (str)
	local D, L
	local wl_type = 2;														-- assume that str is a complex wikilink [[L|D]]

	if not str:match ('^%[%[[^%]]+%]%]$') then								-- is str some sort of a wikilink (must have some sort of content)
		return 0, str, '';													-- not a wikilink; return wl_type as 0, str as D, and empty string as L
	end

	L, D = str:match ('^%[%[([^|]+)|([^%]]+)%]%]$');						-- get L and D from [[L|D]]

	if not is_set (D) then													-- if no separate display
		D = str:match ('^%[%[([^%]]*)|*%]%]$');								-- get D from [[D]] or [[D|]]
		wl_type = 1;
	end

	D = mw.text.trim (D, '%s|');											-- trim white space and pipe characters
	return wl_type, D, L or '';
end


--[[--------------------------< S A F E _ F O R _ U R L >------------------------------------------------------

Escape sequences for content that will be used for URL descriptions

]]

local function safe_for_url( str )
	if str:match( "%[%[.-%]%]" ) ~= nil then
		append_error( 'wikilink_in_url', {});
	end

	return str:gsub( '[%[%]\n]', {
		['['] = '&#91;',													-- square brackets become HTML entities so they cannot close the external link
		[']'] = '&#93;',
		['\n'] = ' ' } );
end


--[[--------------------------< C H E C K _ U R L >------------------------------------------------------------

Determines whether a URL string appears to be valid.

First we test for space characters. If any are found, return false. Then split the url into scheme and domain
portions, or for protocol relative (//example.com) urls, just the domain. Use is_url() to validate the two
portions of the url. If both are valid, or for protocol relative if domain is valid, return true, else false.

]]

local function check_url( url_str )
	if nil == url_str:match ("^%S+$") then									-- if there are any spaces in |url=value it can't be a proper url
		return false;
	end
	local scheme, domain;

	scheme, domain = split_url (url_str);									-- get scheme or nil and domain or nil from url;
	return is_url (scheme, domain);											-- return true if value appears to be a valid url
end


--[[--------------------------< M A K E _ E X T E R N A L _ L I N K >--------------------------------------------

Format an external link with error checking

]]

local function make_external_link( URL, label, source )
	local error_str = "";
	if not is_set( label ) then
		label = URL;
		if is_set( source ) then
			error_str = set_error( 'bare_url_missing_title', source, false, " " );
		else
			throw_error( 'bare_url_no_origin' );
		end
	end
	if not check_url( URL ) then
		if is_set ( source ) then
			error_str = set_error( 'bad_url', source, false, " " ) .. error_str;
		else
			throw_error( 'bad_url_no_origin' );
		end
	end
	return table.concat({ "[", URL, " ", safe_for_url( label ), "]", error_str });
end


--[=[-------------------------< I S _ P A R A M E T E R _ E X T _ W I K I L I N K >----------------------------

Return true if a parameter value has a string that begins and ends with square brackets [ and ] and the first
non-space characters following the opening bracket appear to be a url. The test will also find external wikilinks
that use protocol relative urls. Also finds bare urls.

The frontier pattern prevents a match on interwiki links which are similar to scheme:path urls. The tests that
find bracketed urls are required because the parameters that call this test (currently |title=, |chapter=, |work=,
and |publisher=) may have wikilinks and there are articles or redirects like '//Hus' so, while uncommon, |title=[[//Hus]]
is possible as might be [[en://Hus]].

]=]

local function is_parameter_ext_wikilink (value)
	local scheme, domain;

	value = value:gsub ('([^%s/])/[%a%d].*', '%1');							-- strip path information (the capture prevents false replacement of '//')

	if value:match ('%f[%[]%[%a%S*:%S+.*%]') then							-- if ext wikilink with scheme and domain: [xxxx://yyyyy.zzz]
		scheme, domain = value:match ('%f[%[]%[(%a%S*:)(%S+).*%]')
	elseif value:match ('%f[%[]%[//%S*%.%S+.*%]') then						-- if protocol relative ext wikilink: [//yyyyy.zzz]
		domain = value:match ('%f[%[]%[//(%S*%.%S+).*%]');
	elseif value:match ('%a%S*:%S+') then									-- if bare url with scheme; may have leading or trailing plain text
		scheme, domain = value:match ('(%a%S*:)(%S+)');
	elseif value:match ('//%S*%.%S+') then									-- if protocol relative bare url: //yyyyy.zzz; may have leading or trailing plain text
		domain = value:match ('//(%S*%.%S+)');								-- what is left should be the domain
	else
		return false;														-- didn't find anything that is obviously a url
	end

	return is_url (scheme, domain);											-- return true if value appears to be a valid url
end


--[[-------------------------< C H E C K _ F O R _ U R L >-----------------------------------------------------

loop through a list of parameters and their values. Look at the value and if it has an external link, emit an error message.

]]

local function check_for_external_link (parameter_list)
	local error_message = '';
	for k, v in pairs (parameter_list) do									-- for each parameter in the list
		if is_parameter_ext_wikilink (v) then								-- look at the value; if there is a url add an error message
			if is_set(error_message) then									-- once we've added the first portion of the error message ...
				error_message=error_message .. ", ";						-- ... add a comma space separator
			end
			error_message=error_message .. "&#124;" .. k .. "=";			-- add the failed parameter
		end
	end
	if is_set (error_message) then											-- done looping, if there is an error message, display it
		append_error( 'param_has_ext_link', {error_message});
	end
end


--[[--------------------------< S E T _ S E L E C T E D _ M O D U L E S >--------------------------------------

Import some functions from Module:Citation/CS1/Utilities and Module:Citation/CS1/Error

]]

local function set_selected_modules (utilities_page_ptr, error_page_ptr)
	is_set = utilities_page_ptr.is_set;
	append_error = error_page_ptr.append_error;
	set_error = error_page_ptr.set_error;
	throw_error = error_page_ptr.throw_error;
end


--[[--------------------------< E X P O R T E D   F U N C T I O N S >------------------------------------------
]]

return {
	check_for_external_link = check_for_external_link,						-- exported functions
	make_external_link = make_external_link,
	make_internal_link = make_internal_link,
	remove_wiki_link = remove_wiki_link,
	set_selected_modules = set_selected_modules
}
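The module's split_url() is local and depends on is_set being injected through set_selected_modules(), so it cannot be called directly from outside. The following is a minimal standalone sketch of the same splitting logic, intended only to illustrate how the scheme, authority indicator, and domain are separated. The name split_url_sketch and the stand-in is_set (which assumes "set" means neither nil nor the empty string) are hypothetical and not part of the module.

```lua
-- Hypothetical stand-in for Utilities.is_set; assumes "set" means non-nil and non-empty.
local function is_set (var)
	return not (var == nil or var == '');
end

-- Sketch that mirrors the module's split_url() logic; not the module's exported API.
local function split_url_sketch (url_str)
	local scheme, authority, domain;

	-- drop a trailing FQDN dot plus any path, query, or fragment
	url_str = url_str:gsub ('([%a%d])%.?[/%?#].*$', '%1');

	if url_str:match ('^//%S*') then					-- protocol relative url: no scheme
		domain = url_str:match ('^//(%S*)');
	elseif url_str:match ('%S-:/*%S+') then				-- scheme, optional slashes, domain
		scheme, authority, domain = url_str:match ('(%S-:)(/*)(%S+)');
		authority = authority:gsub ('//', '', 1);		-- remove exactly one '//' pair
		if is_set (authority) then						-- 1 or 3+ slashes were present
			return scheme;								-- domain stays nil, so validation fails later
		end
		domain = domain:gsub ('(%a):%d+', '%1');		-- strip a port number if present
	end

	return scheme, domain;
end

print (split_url_sketch ('https://example.com/path?q=1'))	-- scheme 'https:', domain 'example.com'
print (split_url_sketch ('//example.org/wiki'))				-- scheme nil, domain 'example.org'
print (split_url_sketch ('https:///example.com'))			-- scheme 'https:' only; three slashes are rejected
```

Returning only the scheme when the slash count is wrong leaves the domain nil, which is what later causes is_url() to report the url as invalid.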