Imported Upstream version 1.11.0

*.gem
*.rbc
.bundle
.config
.yardoc
Gemfile.lock
InstalledFiles
_yardoc
coverage
doc/
lib/bundler/man
pkg
rdoc
spec/reports
test/tmp
test/version_tmp
tmp
exec/*
vendor/gems
language: ruby
before_install:
- sudo apt-get update -qq
- sudo apt-get install -qq libicu-dev
script: "bundle exec rake"
rvm:
- 1.9.2
- 1.9.3
- 2.0.0
- 2.1.1
- ree
matrix:
  fast_finish: true
  allow_failures:
    - rvm: ree
# CHANGELOG
## 1.11.0
* Search for text nodes on DocumentFragments without root tags #146 Razer6
* Don't filter @mentions in <style> tags #145 jch
* Prefer `http_url` in HttpsFilter. `base_url` still works. #142 bkeepers
* Remove duplicate check in EmojiFilter #141 Razer6
## 1.10.0
* Anchor TOCFilter with id's instead of name's #140 bkeepers
* Add `details` to sanitization whitelist #139 tansaku
* Fix README spelling #137 Razer6
* Remove ActiveSupport `try` dependency #132 simeonwillbanks
## 1.9.0
* Generalize https filter with :base_url #124 #131 rymohr
* Clean up gemspec dependencies #130 mislav
* EmojiFilter compatibility with gemoji v2 #129 mislav
* Now using Minitest #126 simeonwillbanks
## 1.8.0
* Add custom path support for EmojiFilter #122 bradly
* Reorganize README and add table of contents #118 simeonwillbanks
## 1.7.0
* SanitizationFilter whitelists <s> and <strike> elements #120 charliesome
* ruby 2.1.1 support #119 simeonwillbanks
## 1.6.0
* Doc update for syntax highlighting #108 simeonwillbanks
* Add missing dependency for EmailReplyFilter #110 foca
* Fix deprecation warning for Digest::Digest #103 chrishunt
## 1.5.0
* More flexible whitelist configuration for SanitizationFilter #98 aroben
## 1.4.0
* Fix CamoFilter double entity encoding. #101 josh
## 1.3.0
1.2.0 didn't actually include the following changes. Yanked that release.
* CamoFilter now camos https images. #96 josh
## 1.1.0
* escape emoji filenames in urls #92 jayroh
## 1.0.0
To upgrade to this release, you will need to include separate gems for each of
the filters. See [this section of the README](/README.md#dependencies) for
details.
* filter dependencies are no longer included #80 from simeonwillbanks/simple-dependency-management
* Add link_attr option to Autolink filter #89 from excid3/master
* Add ActiveSupport back in as dependency for xml-mini #85 from mojavelinux/xml-mini
## 0.3.1
* Guard against nil node replacement in SyntaxHighlightFilter #84 jbarnette
## 0.3.0
* Add support for manually specified default language in SyntaxHighlightFilter #81 jbarnette
## 0.2.1
* Moves ActiveSupport as a development dependency #79
## 0.2.0
* Fix README typo #74 tricknotes
* TableOfContentsFilter generates list of sections #75 simeonwillbanks
## 0.1.0
I realized I wasn't properly following [semver](http://semver.org) for interface
changes and new features. Starting from this release, semver will be followed.
* Whitelist table section elements in sanitization filter #55 mojavelinux
* Update readme typo #57 envygeeks
* TOC unicode characters and anchor names for Ruby > 1.9 #64 jakedouglas/non_english_anchors
* Add :skip_tags option for AutolinkFilter #65 pengwynn
* Fix CI dependency issues #67 jch
* Fix ignored test and add Ruby 2.0 to CI. #71, #72 tricknotes
## 0.0.14
* Remove unused can_access_repo? method jch
## 0.0.13
* Update icon class name (only affects TOC pipeline) cameronmcefee #52
## 0.0.12
* add additional payload information for instrumentation mtodd #46
* generate and link to gem docs in README
## 0.0.11
* add instrumentation support. readme cleanup mtodd #45
## 0.0.10
* add bin/html-pipeline util indirect #44
* add result[:mentioned_usernames] for MentionFilter fachen #42
## 0.0.9
* bump escape_utils ~> 0.3, github-linguist ~> 2.6.2 brianmario #41
* remove nokogiri monkey patch for ruby >= 1.9 defunkt #40
## 0.0.8
* raise LoadError instead of printing to stderr if linguist is missing. gjtorikian #36
## 0.0.7
* optionally require github-linguist chrislloyd #33
## 0.0.6
* don't mutate markdown strings: jakedouglas #32
## 0.0.5
* fix li xss vulnerability in sanitization filter: vmg #31
* gemspec cleanup: nbibler #23, jbarnette #24
* doc updates: jch #16, pborreli #17, wickedshimmy #18, benubois #19, blackerby #21
* loosen gemoji dependency: josh #15
## 0.0.4
* initial public release
# Contributing
Thanks for using and improving `HTML::Pipeline`!
- [Submitting a New Issue](#submitting-a-new-issue)
- [Sending a Pull Request](#sending-a-pull-request)
## Submitting a New Issue
If there's an idea you'd like to propose, or a design change, feel free to file a new issue.
If you have an implementation question or believe you've found a bug, please provide as many details as possible:
- Input document
- Output HTML document
- The exact `HTML::Pipeline` code you are using
- Output of the following from your project
```
ruby -v
bundle exec nokogiri -v
```
## Sending a Pull Request
[Pull requests][pr] are always welcome!
Check out [the project's issues list][issues] for ideas on what could be improved.
Before sending, please add tests and ensure the test suite passes.
### Running the Tests
To run the full suite:
`bundle exec rake`
To run a specific test file:
`bundle exec ruby -Itest test/html/pipeline_test.rb`
To run a specific test:
`bundle exec ruby -Itest test/html/pipeline/markdown_filter_test.rb -n test_disabling_gfm`
To run the full suite with all [supported rubies][travisyaml] in bash:
```bash
rubies=(ree-1.8.7-2011.03 1.9.2-p290 1.9.3-p429 2.0.0-p247)
for r in "${rubies[@]}"
do
  rbenv local "$r" # switch to your version manager of choice
  bundle install
  bundle exec rake
done
```
[issues]: https://github.com/jch/html-pipeline/issues
[pr]: https://help.github.com/articles/using-pull-requests
[travisyaml]: https://github.com/jch/html-pipeline/blob/master/.travis.yml
source "https://rubygems.org"
# Specify your gem's dependencies in html-pipeline.gemspec
gemspec
group :development do
  gem "bundler"
  gem "rake"
end
group :test do
  gem "minitest", "~> 5.3"
  gem "rinku", "~> 1.7", :require => false
  gem "gemoji", "~> 1.0", :require => false
  gem "RedCloth", "~> 4.2.9", :require => false
  gem "github-markdown", "~> 0.5", :require => false
  gem "email_reply_parser", "~> 0.5", :require => false
  if RUBY_VERSION < "2.1.0"
    gem "escape_utils", "~> 0.3", :require => false
    gem "github-linguist", "~> 2.6.2", :require => false
  else
    gem "escape_utils", "~> 1.0", :require => false
    gem "github-linguist", "~> 2.10", :require => false
  end
  if RUBY_VERSION < "1.9.2"
    gem "sanitize", ">= 2", "< 2.0.4", :require => false
    gem "nokogiri", ">= 1.4", "< 1.6"
  else
    gem "sanitize", "~> 2.0", :require => false
  end
  if RUBY_VERSION < "1.9.3"
    gem "activesupport", ">= 2", "< 4"
  end
end
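The `RUBY_VERSION < "2.1.0"` style checks in the Gemfile above compare plain Strings. That orders the versions named here correctly, but String comparison goes character by character, so a two-digit major version would sort before "2". The following is only a sketch of a more defensive comparison using `Gem::Version` from RubyGems. The helper name `older_than?` is hypothetical and is not part of this project.

```ruby
require "rubygems" # normally loaded already; required here for clarity

# Plain String comparison is character by character, so "1" < "2" decides it.
puts "10.0.0" < "2.1.0" # true

# Gem::Version compares numeric segments instead.
puts Gem::Version.new("10.0.0") < Gem::Version.new("2.1.0") # false

# Hypothetical helper: true when `current` is an older version than `threshold`.
def older_than?(current, threshold)
  Gem::Version.new(current) < Gem::Version.new(threshold)
end

puts older_than?("1.9.3", "2.1.0") # true
puts older_than?("2.1.1", "2.1.0") # false
```

In a Gemfile, a check of this shape could stand in for the String comparison, for example `if older_than?(RUBY_VERSION, "2.1.0")`.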
Copyright (c) 2012 GitHub Inc. and Jerry Cheung
MIT License
Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:
The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
#!/usr/bin/env rake
require "bundler/gem_tasks"
require 'rake/testtask'
Rake::TestTask.new do |t|
  t.libs << "test"
  t.test_files = FileList['test/**/*_test.rb']
  t.verbose = true
end
task :default => :test
#!/usr/bin/env ruby
require 'html/pipeline'
require 'optparse'
# Accept "help", too
ARGV.map!{|a| a == "help" ? "--help" : a }
OptionParser.new do |opts|
opts.banner = <<-HELP.gsub(/^ /, '')
Usage: html-pipeline [-h] [-f]
html-pipeline [FILTER [FILTER [...]]] < file.md
cat file.md | html-pipeline [FILTER [FILTER [...]]]
HELP
opts.separator "Options:"
opts.on("-f", "--filters", "List the available filters") do
filters = HTML::Pipeline.constants.grep(/\w+Filter$/).
map{|f| f.to_s.gsub(/Filter$/,'') }
# Text filter doesn't work, no call method
filters -= ["Text"]
abort <<-HELP.gsub(/^ /, '')
Available filters:
#{filters.join("\n ")}
HELP
end
end.parse!
# Default to a GitHub-ish pipeline
if ARGV.empty?
filters = [
HTML::Pipeline::MarkdownFilter,
HTML::Pipeline::SanitizationFilter,
HTML::Pipeline::ImageMaxWidthFilter,
HTML::Pipeline::EmojiFilter,
HTML::Pipeline::AutolinkFilter,
HTML::Pipeline::TableOfContentsFilter,
]
# Add syntax highlighting if linguist is present
begin
require 'linguist'
filters << HTML::Pipeline::SyntaxHighlightFilter
rescue LoadError
end
else
def filter_named(name)
case name
when "Text"
raise NameError # Text filter doesn't work, no call method
end
HTML::Pipeline.const_get("#{name}Filter")
rescue NameError => e
abort "Unknown filter '#{name}'. List filters with the -f option."
end
filters = []
until ARGV.empty?
name = ARGV.shift
filters << filter_named(name)
end
end
context = {
:asset_root => "/assets",
:base_url => "/",
:gfm => true
}
puts HTML::Pipeline.new(filters, context).call(ARGF.read)[:output]
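The script above turns command-line names such as `Emoji` into filter classes by appending `Filter` and looking the constant up on `HTML::Pipeline`. The lookup can be sketched in isolation as follows. The `Filters` module and its two empty classes are hypothetical stand-ins, and passing `false` to `const_get` (so that only the namespace itself is searched) is an assumption of this sketch rather than what the script does.

```ruby
# Hypothetical namespace standing in for HTML::Pipeline.
module Filters
  class MarkdownFilter; end
  class EmojiFilter; end
end

# Resolve "Emoji" to Filters::EmojiFilter; return nil for unknown names.
# The second argument (false) restricts the lookup to Filters itself.
def filter_named(name)
  Filters.const_get("#{name}Filter", false)
rescue NameError
  nil
end

# List the short names, in the spirit of the -f option.
def available_filters
  Filters.constants.grep(/\w+Filter$/).map { |c| c.to_s.sub(/Filter$/, "") }.sort
end

p filter_named("Emoji")   # Filters::EmojiFilter
p filter_named("Missing") # nil
p available_filters       # ["Emoji", "Markdown"]
```

Returning `nil` keeps the sketch easy to test. The real script calls `abort` with a message instead.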
# -*- encoding: utf-8 -*-
require File.expand_path("../lib/html/pipeline/version", __FILE__)
Gem::Specification.new do |gem|
gem.name = "html-pipeline"
gem.version = HTML::Pipeline::VERSION
gem.license = "MIT"
gem.authors = ["Ryan Tomayko", "Jerry Cheung"]
gem.email = ["ryan@github.com", "jerry@github.com"]
gem.description = %q{GitHub HTML processing filters and utilities}
gem.summary = %q{Helpers for processing content through a chain of filters}
gem.homepage = "https://github.com/jch/html-pipeline"
gem.files = `git ls-files`.split $/
gem.test_files = gem.files.grep(%r{^test})
gem.require_paths = ["lib"]
gem.add_dependency "nokogiri", "~> 1.4"
gem.add_dependency "activesupport", ">= 2"
gem.post_install_message = <<msg
-------------------------------------------------
Thank you for installing html-pipeline!
You must bundle Filter gem dependencies.
See html-pipeline README.md for more details.
https://github.com/jch/html-pipeline#dependencies
-------------------------------------------------
msg
end
require "nokogiri"
require "active_support/xml_mini/nokogiri" # convert Documents to hashes
module HTML
# GitHub HTML processing filters and utilities. This module includes a small
# framework for defining DOM based content filters and applying them to user
# provided content.
#
# See HTML::Pipeline::Filter for information on building filters.
#
# Construct a Pipeline for running multiple HTML filters. A pipeline is created once
# with one to many filters, and it then can be `call`ed many times over the course
# of its lifetime with input.
#
# filters - Array of Filter objects. Each must respond to call(doc,
# context) and return the modified DocumentFragment or a
# String containing HTML markup. Filters are performed in the
# order provided.
# default_context - The default context hash. Values specified here will be merged
# into values from each individual pipeline run. Can NOT be
# nil. Default: empty Hash.
# result_class - The default Class of the result object for individual
# calls. Default: Hash. Protip: Pass in a Struct to get
# some semblance of type safety.
class Pipeline
autoload :VERSION, 'html/pipeline/version'
autoload :Filter, 'html/pipeline/filter'
autoload :AbsoluteSourceFilter, 'html/pipeline/absolute_source_filter'
autoload :BodyContent, 'html/pipeline/body_content'
autoload :AutolinkFilter, 'html/pipeline/autolink_filter'
autoload :CamoFilter, 'html/pipeline/camo_filter'
autoload :EmailReplyFilter, 'html/pipeline/email_reply_filter'
autoload :EmojiFilter, 'html/pipeline/emoji_filter'
autoload :HttpsFilter, 'html/pipeline/https_filter'
autoload :ImageMaxWidthFilter, 'html/pipeline/image_max_width_filter'
autoload :MarkdownFilter, 'html/pipeline/markdown_filter'
autoload :MentionFilter, 'html/pipeline/@mention_filter'
autoload :PlainTextInputFilter, 'html/pipeline/plain_text_input_filter'
autoload :SanitizationFilter, 'html/pipeline/sanitization_filter'
autoload :SyntaxHighlightFilter, 'html/pipeline/syntax_highlight_filter'
autoload :TextileFilter, 'html/pipeline/textile_filter'
autoload :TableOfContentsFilter, 'html/pipeline/toc_filter'
autoload :TextFilter, 'html/pipeline/text_filter'
# Our DOM implementation.
DocumentFragment = Nokogiri::HTML::DocumentFragment
# Parse a String into a DocumentFragment object. When a DocumentFragment is
# provided, return it verbatim.
def self.parse(document_or_html)
document_or_html ||= ''
if document_or_html.is_a?(String)
DocumentFragment.parse(document_or_html)
else
document_or_html
end
end
# Public: Returns an Array of Filter objects for this Pipeline.
attr_reader :filters
# Public: Instrumentation service for the pipeline.
# Set an ActiveSupport::Notifications compatible object to enable.
attr_accessor :instrumentation_service
# Public: String name for this Pipeline. Defaults to Class name.
attr_writer :instrumentation_name
def instrumentation_name
@instrumentation_name || self.class.name
end
class << self
# Public: Default instrumentation service for new pipeline objects.
attr_accessor :default_instrumentation_service
end
def initialize(filters, default_context = {}, result_class = nil)
raise ArgumentError, "default_context cannot be nil" if default_context.nil?
@filters = filters.flatten.freeze
@default_context = default_context.freeze
@result_class = result_class || Hash
@instrumentation_service = self.class.default_instrumentation_service
end
# Apply all filters in the pipeline to the given HTML.
#
# html - A String containing HTML or a DocumentFragment object.
# context - The context hash passed to each filter. See the Filter docs
# for more info on possible values. This object MUST NOT be modified
# in place by filters. Use the Result for passing state back.
# result - The result Hash passed to each filter for modification. This
# is where Filters store extracted information from the content.
#
# Returns the result Hash after being filtered by this Pipeline. Contains an
# :output key with the DocumentFragment or String HTML markup based on the
# output of the last filter in the pipeline.
def call(html, context = {}, result = nil)
context = @default_context.merge(context)
context = context.freeze
result ||= @result_class.new
payload = default_payload :filters => @filters.map(&:name),
:context => context, :result => result
instrument "call_pipeline.html_pipeline", payload do
result[:output] =
@filters.inject(html) do |doc, filter|
perform_filter(filter, doc, context, result)
end
end
result
end
# Internal: Applies a specific filter to the supplied doc.
#
# The filter is instrumented.
#
# Returns the result of the filter.
def perform_filter(filter, doc, context, result)
payload = default_payload :filter => filter.name,
:context => context, :result => result
instrument "call_filter.html_pipeline", payload do
filter.call(doc, context, result)
end
end
# Like call but guarantees the value returned is a DocumentFragment.
# Pipelines may return a DocumentFragment or a String. Callers that need a
# DocumentFragment should use this method.
def to_document(input, context = {}, result = nil)
result = call(input, context, result)
HTML::Pipeline.parse(result[:output])
end
# Like call but guarantees the value returned is a String of HTML markup.
def to_html(input, context = {}, result = nil)
result = call(input, context, result)
output = result[:output]
if output.respond_to?(:to_html)
output.to_html
else
output.to_s
end
end
# Public: setup instrumentation for this pipeline.
#
# Returns nothing.
def setup_instrumentation(name = nil, service = nil)
self.instrumentation_name = name
self.instrumentation_service =
service || self.class.default_instrumentation_service
end
# Internal: if the `instrumentation_service` object is set, instruments the
# block, otherwise the block is run without instrumentation.
#
# Returns the result of the provided block.
def instrument(event, payload = nil)
payload ||= default_payload
return yield(payload) unless instrumentation_service
instrumentation_service.instrument event, payload do |payload|
yield payload
end
end
# Internal: Default payload for instrumentation.
#
# Accepts a Hash of additional payload data to be merged.
#
# Returns a Hash.
def default_payload(payload = {})
{:pipeline => instrumentation_name}.merge(payload)
end
end
end
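The calling convention described in the class above (each filter responds to `call(doc, context, result)`, filters run in the order given, the context is frozen, and the result Hash carries both `:output` and anything filters extract) can be illustrated without Nokogiri. This is a simplified sketch, not the library's implementation: `MiniPipeline` and the three lambdas are hypothetical names, and it omits parsing and instrumentation.

```ruby
# Simplified stand-in for the pipeline's fold over filters (illustrative only).
class MiniPipeline
  def initialize(filters, default_context = {})
    raise ArgumentError, "default_context cannot be nil" if default_context.nil?
    @filters = filters.flatten.freeze
    @default_context = default_context.freeze
  end

  # Fold the input through each filter in order. Every filter sees the same
  # frozen context and the same mutable result Hash.
  def call(input, context = {}, result = nil)
    context = @default_context.merge(context).freeze
    result ||= {}
    result[:output] = @filters.inject(input) do |doc, filter|
      filter.call(doc, context, result)
    end
    result
  end
end

# Filters here are plain callables taking (doc, context, result).
strip = lambda { |doc, _context, _result| doc.strip }
count_words = lambda do |doc, _context, result|
  result[:word_count] = doc.split.size # state goes in the result, not the context
  doc
end
wrap = lambda { |doc, context, _result| "<#{context[:tag]}>#{doc}</#{context[:tag]}>" }

pipeline = MiniPipeline.new([strip, count_words, wrap], :tag => "p")
result = pipeline.call("  hello filter chain \n")
puts result[:output]     # <p>hello filter chain</p>
puts result[:word_count] # 3
```

Because the context is frozen, a filter that tried to assign into it would raise, which is why extracted data is written to the result Hash instead.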
# XXX nokogiri monkey patches for 1.8
if not ''.respond_to?(:force_encoding)
class Nokogiri::XML::Node
# Work around an issue with utf-8 encoded data being erroneously converted to
# ... some other shit when replacing text nodes. See 'utf-8 output 2' in
# user_content_test.rb for details.
def replace_with_encoding_fix(replacement)
if replacement.respond_to?(:to_str)
replacement = document.fragment("<div>#{replacement}</div>").children.first.children
end
replace_without_encoding_fix(replacement)
end
alias_method :replace_without_encoding_fix, :replace
alias_method :replace, :replace_with_encoding_fix
def swap(replacement)
replace(replacement)
self
end
end
end
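The patch above uses the alias-chaining idiom: the original `replace` is kept under a second name, and `replace` is then pointed at a wrapper that calls it. A minimal sketch of the same idiom on a toy class follows. `Label` and the `_strip` method names are hypothetical and have nothing to do with Nokogiri.

```ruby
# Hypothetical class standing in for a library class to be patched.
class Label
  attr_reader :text

  def initialize(text)
    @text = text
  end

  def replace(value)
    @text = value
    self
  end
end

# Reopen the class and wrap `replace`, keeping the original reachable.
class Label
  def replace_with_strip(value)
    value = value.strip if value.respond_to?(:strip)
    replace_without_strip(value) # delegate to the original implementation
  end

  alias_method :replace_without_strip, :replace
  alias_method :replace, :replace_with_strip
end

label = Label.new("old")
label.replace("  new  ")
puts label.text # new
```

The order of the two `alias_method` calls matters: the original must be saved under its new name before `replace` is redirected to the wrapper.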
require 'set'
module HTML
class Pipeline
# HTML filter that replaces @user mentions with links. Mentions within <pre>,
# <code>, <a>, and <style> elements are ignored. Mentions that reference users
# that do not exist are ignored.
#
# Context options:
# :base_url - Used to construct links to user profile pages for each
# mention.
# :info_url - Used to link to "more info" when someone mentions @mention
# or @mentioned.
#
class MentionFilter < Filter
# Public: Find user @mentions in text. See
# MentionFilter#mention_link_filter.
#
# MentionFilter.mentioned_logins_in(text) do |match, login, is_mentioned|
# "<a href=...>#{login}</a>"
# end
#
# text - String text to search.
#
# Yields the String match, the String login name, and a Boolean determining
# if the match = "@mention[ed]". The yield's return replaces the match in
# the original text.
#
# Returns a String replaced with the return of the block.
def self.mentioned_logins_in(text)
text.gsub MentionPattern do |match|
login = $1
yield match, login, MentionLogins.include?(login.downcase)
end
end
# Pattern used to extract @mentions from text.
MentionPattern = /
(?:^|\W) # beginning of string or non-word char
@((?>[a-z0-9][a-z0-9-]*)) # @username
(?!\/) # without a trailing slash
(?=
\.+[ \t\W]| # dots followed by space or non-word character
\.+$| # dots at end of line
[^0-9a-zA-Z_.]| # non-word character except dot
$ # end of line
)
/ix
# List of username logins that, when mentioned, link to the blog post
# about @mentions instead of triggering a real mention.
MentionLogins = %w(
mention
mentions
mentioned
mentioning
)
# Don't look for mentions in text nodes that are children of these elements
IGNORE_PARENTS = %w(pre code a style).to_set
def call
result[:mentioned_usernames] ||= []
search_text_nodes(doc).each do |node|
content = node.to_html
next if !content.include?('@')
next if has_ancestor?(node, IGNORE_PARENTS)
html = mention_link_filter(content, base_url, info_url)
next if html == content
node.replace(html)
end
doc
end
# The URL to provide when someone @mentions a "mention" name, such as
# @mention or @mentioned, that will give them more info on mentions.
def info_url
context[:info_url] || nil
end
# Replace user @mentions in text with links to the mentioned user's
# profile page.
#
# text - String text to replace @mention usernames in.
# base_url - The base URL used to construct user profile URLs.
# info_url - The "more info" URL used to link to more info on @mentions.
# If nil we don't link @mention or @mentioned.
#
# Returns a string with @mentions replaced with links. All links have a
# 'user-mention' class name attached for styling.
def mention_link_filter(text, base_url='/', info_url=nil)
self.class.mentioned_logins_in(text) do |match, login, is_mentioned|
link =
if is_mentioned
link_to_mention_info(login, info_url)
else
link_to_mentioned_user(login)
end
link ? match.sub("@#{login}", link) : match
end
end
def link_to_mention_info(text, info_url=nil)
return "@#{text}" if info_url.nil?
"<a href='#{info_url}' class='user-mention'>" +
"@#{text}" +
"</a>"
end
def link_to_mentioned_user(login)