Commit 817dff78 authored by Raphaël Hertzog

New upstream version 0.87+ds1

parent db74352b
......@@ -2,7 +2,7 @@ Please prefix your issue with one of the following: [BUG], [PROPOSAL], [QUESTION
CCExtractor version (using the --version parameter preferably) : **X.X**
**In raising this issue, I confirm the following (please check boxes, eg [X]):**
**In raising this issue, I confirm the following (please check boxes, eg [X] - and delete unchecked ones):**
- [ ] I have read and understood the [contributors guide](https://github.com/CCExtractor/ccextractor/blob/master/.github/CONTRIBUTING.md).
- [ ] I have checked that the bug-fix I am reporting can be replicated, or that the feature I am suggesting isn't already present.
......@@ -11,7 +11,7 @@ CCExtractor version (using the --version parameter preferably) : **X.X**
- [ ] I have checked the pull requests tab for existing solutions/implementations to my issue/suggestion.
- [ ] I have used the latest available version of CCExtractor to verify this issue exists.
**My familiarity with the project is as follows (check one, eg [X]):**
**My familiarity with the project is as follows (check one, eg [X] - and delete unchecked ones):**
- [ ] I have never used CCExtractor.
- [ ] I have used CCExtractor just a couple of times.
......@@ -21,9 +21,9 @@ CCExtractor version (using the --version parameter preferably) : **X.X**
**Necessary information**
- Is this a regression (did it work before)? [ ] NO | [ ] YES - *please specify the last known working version*
- What platform did you use? [ ] Windows - [ ] Linux - [ ] Mac
- What where the used arguments? `-autoprogram`
- What were the used arguments? `-autoprogram`
**Video links**
**Video links (replace text below with your links) **
Please make the affected input file available for us (no screenshots, those don't help!). Public links to Dropbox, Google Drive, etc, are all fine. If it is not possible to make it available publicly, send us a private invitation (both Dropbox and Google Drive allow that). In this case we will download the file and upload it to the private developer repository.
......
......@@ -40,6 +40,7 @@ build/
*.user
*.opendb
*.db
*.vscode
####
# Ignore the header file that is updated upon build
......@@ -51,6 +52,7 @@ windows/libs/tesseract/**
# Ctags
*.tags*
tags
# Vagrant
.vagrant/
......@@ -120,3 +122,14 @@ src/**/.deps
src/**/.dirstamp
mac/ccextractorGUI
linux/ccextractorGUI
linux/CMakeCache.txt
linux/CMakeFiles/
linux/cmake_install.cmake
linux/install_manifest.txt
linux/lib_ccx/
mac/lib_ccx/
mac/install_manifest.txt
mac/cmake_install.cmake
mac/CMakeFiles/
mac/CMakeCache.txt
*.py.bak
......@@ -10,17 +10,16 @@ compiler:
- gcc
- clang
install:
- if [[ $TRAVIS_OS_NAME == 'osx' ]]; then brew install pkg-config autoconf automake libtool tesseract leptonica; fi
before_install:
- if [[ $TRAVIS_OS_NAME == 'osx' ]]; then brew upgrade automake; brew install pkg-config autoconf automake libtool tesseract leptonica; fi
- if [[ $TRAVIS_OS_NAME == 'linux' ]]; then sudo apt-get install -y libcurl4-gnutls-dev tesseract-ocr tesseract-ocr-dev libleptonica-dev autoconf-archive; fi
- if [[ $TRAVIS_OS_NAME == 'linux' ]]; then sudo apt-get install python-distutils-extra tesseract-ocr tesseract-ocr-eng libopencv-dev libtesseract-dev libleptonica-dev python-all-dev swig libcv-dev python-opencv python-numpy python-setuptools build-essential subversion; fi
- if [[ $TRAVIS_OS_NAME == 'linux' ]]; then wget https://github.com/DanBloomberg/leptonica/releases/download/1.74.4/leptonica-1.74.4.tar.gz && tar xvf leptonica-1.74.4.tar.gz; cd leptonica-1.74.4; ./configure && make && sudo make install; fi
- if [[ $TRAVIS_OS_NAME == 'linux' ]]; then wget https://github.com/DanBloomberg/leptonica/releases/download/1.74.4/leptonica-1.74.4.tar.gz && tar xvf leptonica-1.74.4.tar.gz; cd leptonica-1.74.4; ./configure && make && sudo make install; cd ..; fi
- if [[ $TRAVIS_OS_NAME == 'linux' ]]; then git clone https://github.com/tesseract-ocr/tesseract.git; cd tesseract; ./autogen.sh; ./configure --enable-debug; LDFLAGS="-L/usr/local/lib" CFLAGS="-I/usr/local/include" make; sudo make install; sudo ldconfig; cd ..; fi
script:
- if [[ "$TRAVIS_OS_NAME" == "osx" ]]; then cd mac; ./build.command; fi
- if [[ "$TRAVIS_OS_NAME" == "osx" ]]; then cd mac; ./autogen.sh; ./configure; make; fi
- if [[ "$TRAVIS_OS_NAME" == "osx" ]]; then cd mac; ./build.command; cd ..; fi
- if [[ "$TRAVIS_OS_NAME" == "osx" ]]; then cd mac; ./autogen.sh; ./configure; make; cd ..; fi
- if [[ "$TRAVIS_OS_NAME" == "linux" ]]; then cd linux; ./build; cd ..; fi
- if [[ "$TRAVIS_OS_NAME" == "linux" ]]; then mkdir build; cd build; cmake ../src/; make; cd ..; fi
- if [[ "$TRAVIS_OS_NAME" == "linux" ]]; then cd linux; ./autogen.sh; ./configure; make; fi
- if [[ "$TRAVIS_OS_NAME" == "linux" ]]; then cd linux; ./autogen.sh; ./configure; make; cd ..; fi
![logo](https://avatars3.githubusercontent.com/u/7253637?v=3&s=100)
<img src ="https://github.com/CCExtractor/ccextractor-org-media/blob/master/static/ccx_logo_transparent_800x600.png" width="200px" alt="logo" />
# CCExtractor
[![Build Status](https://travis-ci.org/CCExtractor/ccextractor.svg?branch=master)](https://travis-ci.org/CCExtractor/ccextractor)
[![Sample-Platform Build Status Windows](https://sampleplatform.ccextractor.org/static/img/status/build-windows.svg?maxAge=1800)](https://sampleplatform.ccextractor.org/test/master/windows)
[![Sample-Platform Build Status Linux](https://sampleplatform.ccextractor.org/static/img/status/build-linux.svg?maxAge=1800)](https://sampleplatform.ccextractor.org/test/master/linux)
[![SourceForge](https://img.shields.io/badge/SourceForge%20downloads-213k%2Ftotal-brightgreen.svg)](https://sourceforge.net/projects/ccextractor/)
CCExtractor is a tool used to produce subtitles for TV recordings from almost anywhere in the world. We intend to keep up with all sources and formats.
......@@ -21,18 +24,16 @@ The official repository is ([CCExtractor/ccextractor](https://github.com/CCExtra
The core functionality is written in C. Other languages used include C++ and Python.
## Google Code-in 2017
## Google Code-in 2018
CCExtractor is [participating in Google Code-in 2017!](https://ccextractor.org/public:codein:welcome_2017)
CCExtractor is participating in Google Code-in 2018!
Google Code-In is a competition that encourages young people to learn more about Open Source and contribute to it. Tasks range from coding, documentation, quality assurance, user interface, outreach and research.
Google Code-in is a competition that encourages young people to learn more about Open Source and contribute to it. Tasks range from coding, documentation, quality assurance, design, outreach and research.
This is our second year of challenging tasks for pre-university students aged 13-17.
This is our third year of participating and creating challenging tasks for pre-university students aged 13-17.
If you are a student who fits the age criteria and is interested in contributing to CCExtractor, feel free to join us. You can read more at the [Google Code-in website](https://codein.withgoogle.com).
If you're interested in design tasks, you should read [this](https://www.ccextractor.org/public:codein:google_code-in_2017_code-in_for_designers) first.
## Installation and Usage
Downloads for precompiled binaries and source code can be found [on our website](https://www.ccextractor.org?id=public:general:downloads).
......@@ -56,13 +57,13 @@ To learn more about how to compile and build CCExtractor for your platform check
## Support
By far the best way to get support is by opening an issue at our [issue tracker](https://github.com/CCExtractor/ccextractor/issues).
By far the best way to get support is by opening an issue at our [issue tracker](https://github.com/CCExtractor/ccextractor/issues).
When you create a new issue, please fill in the needed details in the provided template. That makes it easier for us to help you more efficiently.
If you have a question or a problem you can also [contact us by email or chat with the team in Slack](https://www.ccextractor.org/doku.php?id=public:general:support).
If you have a question or a problem you can also [contact us by email or chat with the team in Slack](https://www.ccextractor.org/doku.php?id=public:general:support).
If you want to contribute to CCExtractor but can't submit some code patches or issues or video samples, you can also [donate to us](https://www.ccextractor.org/public:general:http:sourceforge.net_donate_index.php?group_id=190832)
If you want to contribute to CCExtractor but can't submit some code patches or issues or video samples, you can also [donate to us](https://www.ccextractor.org/public:general:http:sourceforge.net_donate_index.php?group_id=190832)
## Contributing
......@@ -70,10 +71,10 @@ You can contribute to the project by reporting issues, forking it, modifying the
## News & Other Information
News about releases and modifications to the code can be found in the [CHANGES.TXT](docs/CHANGES.TXT) file.
News about releases and modifications to the code can be found in the [CHANGES.TXT](docs/CHANGES.TXT) file.
For more information visit the CCExtractor website: [https://www.ccextractor.org](https://www.ccextractor.org)
## License
GNU General Public License version 2.0 (GPL-2.0)
\ No newline at end of file
GNU General Public License version 2.0 (GPL-2.0)
from builtins import str
import ccextractor as cc
import ccx_to_python_g608 as g608
import python_srt_generator as srt_generator
......
from __future__ import print_function
###
#MANDATORY UPDATES IN EVERY PYTHON SCRIPT
###
import ccextractor as cc
import api_support
from multiprocessing import Queue,Process,Event
import sys
import time
def templer():
s = cc.api_init_options()
cc.check_configuration_file(s)
for i in sys.argv[1:]:
cc.api_add_param(s,str(i))
#very mandatory for keeping a track of pythonapi call. Always must be set.
cc.my_pythonapi(s, callback)
compile_ret = cc.compile_params(s,len(sys.argv[1:]));
#very mandatory for keeping a track of pythonapi call. Always must be called so that the program knows that the call is from pythonapi.
cc.call_from_python_api(s)
start_ret = cc.api_start(s);
import ccextractor as cc
def callback(line, encoding):
api_support.generate_output_srt(line, str(encoding))
print(line)
def init_ccextractor(callback):
"""
:param callback: The callback which we use to handle
the extracted subtitle info
:return return the initialized options
"""
optionos = cc.api_init_options()
cc.check_configuration_file(optionos)
for arg in sys.argv[1:]:
cc.api_add_param(optionos, arg)
compile_ret = cc.compile_params(optionos, len(sys.argv[1:]))
# use my_pythonapi to add callback in C source code
cc.my_pythonapi(optionos, callback)
return optionos
def main():
options = init_ccextractor(callback)
cc.api_start(options)
if __name__=="__main__":
templer()
main()
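The call order used by the new `init_ccextractor()` can be tried without a compiled extension. The sketch below is illustrative only: `FakeCC` is a hypothetical stand-in for the SWIG-generated `ccextractor` module, it exposes only the function names that appear in the script above, and the subtitle payload it passes to the callback is made up.

```python
class FakeCC:
    """Hypothetical stand-in for the SWIG-generated `ccextractor` module."""

    def api_init_options(self):
        return {"params": [], "callback": None}

    def check_configuration_file(self, options):
        pass  # the real module reads a configuration file here

    def api_add_param(self, options, arg):
        options["params"].append(arg)

    def compile_params(self, options, argc):
        return 0 if argc == len(options["params"]) else -1

    def my_pythonapi(self, options, callback):
        options["callback"] = callback

    def api_start(self, options):
        # Pretend that one subtitle line was extracted (made-up payload).
        options["callback"]("HELLO WORLD", "1")
        return 0


def init_ccextractor(cc, callback, args):
    """Same call order as the script above: add params, compile, register callback."""
    options = cc.api_init_options()
    cc.check_configuration_file(options)
    for arg in args:
        cc.api_add_param(options, arg)
    cc.compile_params(options, len(args))
    cc.my_pythonapi(options, callback)
    return options


received = []


def callback(line, encoding):
    received.append((line, encoding))


cc = FakeCC()
options = init_ccextractor(cc, callback, ["sample.ts"])
start_ret = cc.api_start(options)
```

Passing the module in as a parameter is a choice made for this sketch so the flow can be exercised with a stand-in. The script in the diff imports `ccextractor` directly.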
#!/bin/bash
BLD_FLAGS="-std=gnu99 -Wno-write-strings -DGPAC_CONFIG_LINUX -D_FILE_OFFSET_BITS=64 -DVERSION_FILE_PRESENT -DENABLE_OCR"
BLD_INCLUDE="-I /usr/include/python2.7/ -I../src -I /usr/include/leptonica/ -I /usr/include/tesseract/ -I../src/lib_ccx/ -I../src/gpacmp4/ -I../src/libpng/ -I../src/zlib/ -I../src/zvbi -I../src/lib_hash -I../src/protobuf-c -I../src/utf8proc"
WRAPPER_FLAGS="-Wl,-wrap,write"
BLD_FLAGS="-std=gnu99 -Wno-write-strings -DGPAC_CONFIG_LINUX -D_FILE_OFFSET_BITS=64 -DVERSION_FILE_PRESENT -DENABLE_OCR -DFT2_BUILD_LIBRARY -DGPAC_DISABLE_VTT -DGPAC_DISABLE_OD_DUMP -DPYTHON_API"
BLD_INCLUDE="-I/usr/include/python2.7/ -I../src -I /usr/include/leptonica/ -I /usr/include/tesseract/ -I../src/lib_ccx/ -I../src/gpacmp4/ -I../src/libpng/ -I../src/zlib/ -I../src/zvbi -I../src/lib_hash -I../src/protobuf-c -I../src/utf8proc -I../src/freetype/include"
SRC_LIBPNG="$(find ../src/libpng/ -name '*.c')"
SRC_ZLIB="$(find ../src/zlib/ -name '*.c')"
SRC_ZVBI="$(find ../src/zvbi/ -name '*.c')"
......@@ -10,14 +9,59 @@ SRC_GPAC="$(find ../src/gpacmp4/ -name '*.c')"
SRC_HASH="$(find ../src/lib_hash/ -name '*.c')"
SRC_PROTOBUF="$(find ../src/protobuf-c/ -name '*.c')"
SRC_UTF8PROC="../src/utf8proc/utf8proc.c"
SRC_FREETYPE="../src/freetype/autofit/autofit.c
../src/freetype/base/ftbase.c
../src/freetype/base/ftbbox.c
../src/freetype/base/ftbdf.c
../src/freetype/base/ftbitmap.c
../src/freetype/base/ftcid.c
../src/freetype/base/ftfntfmt.c
../src/freetype/base/ftfstype.c
../src/freetype/base/ftgasp.c
../src/freetype/base/ftglyph.c
../src/freetype/base/ftgxval.c
../src/freetype/base/ftinit.c
../src/freetype/base/ftlcdfil.c
../src/freetype/base/ftmm.c
../src/freetype/base/ftotval.c
../src/freetype/base/ftpatent.c
../src/freetype/base/ftpfr.c
../src/freetype/base/ftstroke.c
../src/freetype/base/ftsynth.c
../src/freetype/base/ftsystem.c
../src/freetype/base/fttype1.c
../src/freetype/base/ftwinfnt.c
../src/freetype/bdf/bdf.c
../src/freetype/bzip2/ftbzip2.c
../src/freetype/cache/ftcache.c
../src/freetype/cff/cff.c
../src/freetype/cid/type1cid.c
../src/freetype/gzip/ftgzip.c
../src/freetype/lzw/ftlzw.c
../src/freetype/pcf/pcf.c
../src/freetype/pfr/pfr.c
../src/freetype/psaux/psaux.c
../src/freetype/pshinter/pshinter.c
../src/freetype/psnames/psnames.c
../src/freetype/raster/raster.c
../src/freetype/sfnt/sfnt.c
../src/freetype/smooth/smooth.c
../src/freetype/truetype/truetype.c
../src/freetype/type1/type1.c
../src/freetype/type42/type42.c
../src/freetype/winfonts/winfnt.c"
API_WRAPPERS="$(find ../src/wrappers/ -name '*.c')"
API_EXTRACTORS="$(find ../src/extractors/ -name '*.c')"
BLD_SOURCES="../src/ccextractor.c ccextractor_wrap.c $SRC_CCX $SRC_GPAC $SRC_ZLIB $SRC_ZVBI $SRC_LIBPNG $SRC_HASH $SRC_PROTOBUF $SRC_UTF8PROC $API_WRAPPERS $API_EXTRACTORS"
# the `swig -python ccextractor.i` will generate ccextractor_wrap.c
BLD_SOURCES="../src/ccextractor.c ccextractor_wrap.c $SRC_CCX $SRC_GPAC $SRC_ZLIB $SRC_ZVBI $SRC_LIBPNG $SRC_HASH $SRC_PROTOBUF $SRC_UTF8PROC $API_WRAPPERS $SRC_FREETYPE"
BLD_LINKER="-lm -zmuldefs -l tesseract -l lept -l python3.6m"
echo "Running pre-build script..."
../linux/pre-build.sh
echo "Trying to compile..."
out=$((LC_ALL=C gcc -fPIC -c -DPYTHONAPI $BLD_FLAGS $BLD_INCLUDE $BLD_SOURCES $BLD_LINKER) 2>&1)
#out=$((LC_ALL=C gcc -fPIC -c -Wl,-wrap,write $BLD_INCLUDE $BLD_SOURCES) 2>&1)
out=$((swig -python ccextractor.i && LC_ALL=C gcc -fPIC -c $BLD_FLAGS $BLD_INCLUDE $BLD_SOURCES $BLD_LINKER)2>&1)
res=$?
if [[ $out == *"gcc: command not found"* ]]
then
......@@ -45,4 +89,9 @@ then
>&2 echo "$out"
exit 5
fi
echo "Compilation successful";
if [[ "$out" != "" ]] ; then
echo "$out"
echo "Compilation successful, compiler message shown in previous lines"
else
echo "Compilation successful, no compiler messages."
fi
#!/bin/bash
BLD_FLAGS="-std=gnu99 -Wno-write-strings -DGPAC_CONFIG_LINUX -D_FILE_OFFSET_BITS=64 -DVERSION_FILE_PRESENT -DENABLE_OCR"
BLD_INCLUDE="-I /usr/include/python2.7/ -I../src -I /usr/include/leptonica/ -I /usr/include/tesseract/ -I../src/lib_ccx/ -I../src/gpacmp4/ -I../src/libpng/ -I../src/zlib/ -I../src/zvbi -I../src/lib_hash -I../src/protobuf-c -I../src/utf8proc"
SRC_LIBPNG="$(find ../src/libpng/ -name '*.c')"
SRC_ZLIB="$(find ../src/zlib/ -name '*.c')"
SRC_ZVBI="$(find ../src/zvbi/ -name '*.c')"
SRC_CCX="$(find ../src/lib_ccx/ -name '*.c')"
SRC_GPAC="$(find ../src/gpacmp4/ -name '*.c')"
SRC_HASH="$(find ../src/lib_hash/ -name '*.c')"
SRC_PROTOBUF="$(find ../src/protobuf-c/ -name '*.c')"
SRC_UTF8PROC="../src/utf8proc/utf8proc.c"
BLD_SOURCES="../src/ccextractor.c ../src/ccextractorapi.c ccextractorapi_wrap.c $SRC_CCX $SRC_GPAC $SRC_ZLIB $SRC_ZVBI $SRC_LIBPNG $SRC_HASH $SRC_PROTOBUF $SRC_UTF8PROC"
BLD_LINKER="-lm -pthread -zmuldefs -l tesseract -l lept -l python2.7"
./pre-build.sh
out=$((LC_ALL=C gcc $BLD_FLAGS $BLD_INCLUDE -o ccextractorapi -g $BLD_SOURCES $BLD_LINKER) 2>&1)
res=$?
if [[ $out == *"gcc: command not found"* ]]
then
echo "Error: please install gcc";
exit 1
fi
if [[ $out == *"curl.h: No such file or directory"* ]]
then
echo "Error: please install curl development library (libcurl4-gnutls-dev for Debian/Ubuntu)";
exit 2
fi
if [[ $out == *"capi.h: No such file or directory"* ]]
then
echo "Error: please install tesseract development library (tesseract-ocr-dev for Debian/Ubuntu)";
exit 3
fi
if [[ $out == *"allheaders.h: No such file or directory"* ]]
then
echo "Error: please install leptonica development library (libleptonica-dev for Debian/Ubuntu)";
exit 4
fi
if [[ $res -ne 0 ]] # Unknown error
then
echo "Compiled with errors"
>&2 echo "$out"
exit 5
fi
echo "Compilation successful";
......@@ -2,8 +2,7 @@
BLD_LINKER="-lm -zmuldefs -l tesseract -l lept -l python2.7"
WRAPPER_FLAGS="-Wl,-wrap,write"
out=$((swig -python ccextractor.i && ./build_api && gcc -shared $(find -name '*.o') -o _ccextractor.so $BLD_LINKER) 2>&1)
#out=$((swig -python ccextractor.i && ./build_api && gcc -shared $WRAPPER_FLAGS $(find -name '*.o') -o _ccextractor.so ) 2>&1)
out=$((./build_api && gcc -shared $(find -name '*.o') -o _ccextractor.so $BLD_LINKER)2>&1)
res=$?
if [[ $out == *"gcc: command not found"* ]]
then
......
......@@ -7,72 +7,12 @@
#include "../src/lib_ccx/ccx_mp4.h"
#include "../src/lib_ccx/hardsubx.h"
#include "../src/lib_ccx/ccx_share.h"
#include "../src/ccextractor.h"
#include "../src/wrappers/wrapper.h"
#include "../src/ccextractor.h"
#include "../src/wrappers/wrapper.h"
%}
void my_pythonapi(struct ccx_s_options *api_options, PyObject* func);
%pythoncode %{
def g608_grid_former(line,text,color,font):
if "text[" in line:
line = str(line.split(":", 1)[1])
line = str(line.split("\n")[0])
text.append(line)
if "color[" in line:
line = str(line.split(":", 1)[1])
line = str(line.split("\n")[0])
color.append(line)
if "font[" in line:
line = str(line.split(":", 1)[1])
line = str(line.split("\n")[0])
font.append(line)
def print_g608_grid(case,text,color,font):
help_string = """
Case is the value that would give the desired output.
case = 0 --> print start_time,end_time,text,color,font
case = 1 --> print start_time,end_time,text
case = 2 --> print start_time,end_time,color
case = 3 --> print start_time,end_time,font
case = 4 --> print start_time,end_time,text,color
case = 5 --> print start_time,end_time,text,font
case = 6 --> print start_time,end_time,color,font
"""
if case==0:
if text:
print "\n".join(text)
if color:
print "\n".join(color)
if font:
print "\n".join(font)
elif case==1:
if text:
print "\n".join(text)
elif case==2:
if color:
print "\n".join(color)
elif case==3:
if font:
print "\n".join(font)
elif case==4:
if text:
print "\n".join(text)
if color:
print "\n".join(color)
elif case==5:
if text:
print "\n".join(text)
if font:
print "\n".join(font)
elif case==6:
if color:
print "\n".join(color)
if font:
print "\n".join(font)
else:
print help_string
%}
%include "../src/lib_ccx/ccx_common_common.h"
%include "../src/ccextractor.h"
%include "../src//wrappers/wrapper.h"
struct ccx_s_options* api_init_options();
void check_configuration_file(struct ccx_s_options api_options);
int compile_params(struct ccx_s_options *api_options,int argc);
void api_add_param(struct ccx_s_options* api_options,char* arg);
int api_start(struct ccx_s_options api_options);
void my_pythonapi(struct ccx_s_options *api_options, PyObject *func);
from __future__ import print_function
from builtins import str
def g608_grid_former(line,text,color,font):
if "text[" in line:
line = str(line.split(":", 1)[1])
......@@ -57,5 +59,5 @@ def return_g608_grid(case,text,color,font):
if font:
ret_val['font']=font
else:
print help_string
print(help_string)
return ret_val
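To make the behaviour of `g608_grid_former` concrete, here is a condensed restatement with a small run. It is a sketch, not the upstream code: it stops at the first matching marker, and the sample lines are invented for illustration, since the exact line format emitted by the C side is not shown in this diff.

```python
def g608_grid_former(line, text, color, font):
    """Append the value of a 'text[', 'color[' or 'font[' line to its list."""
    for marker, grid in (("text[", text), ("color[", color), ("font[", font)):
        if marker in line:
            # Keep what follows the first ':' and drop anything after a newline.
            value = line.split(":", 1)[1].split("\n")[0]
            grid.append(value)
            break


text, color, font = [], [], []
# Hypothetical input lines; only the 'marker[...]:value' shape is assumed.
sample_lines = [
    "text[0]:HELLO WORLD\n",
    "color[0]:WWWWWWWWWWW\n",
    "font[0]:RRRRRRRRRRR\n",
]
for sample in sample_lines:
    g608_grid_former(sample, text, color, font)
```

After the loop, each list holds the part of its line between the first colon and the newline.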
from __future__ import print_function
from builtins import zip
from builtins import str
import ccextractor as cc
import re
"""
......@@ -119,23 +122,23 @@ def comparing_text_font_grids(text, font, color):
if not i[1]:
final.append(i[0])
else:
print "error"
print("error")
return (final,font,color)
def generate_output_srt(filename,d, encoding):
if encoding in encodings_map.keys():
if encoding in list(encodings_map.keys()):
if encoding!='0':
encoding_format = encodings_map[encoding]
else:
encoding_format = ""
else:
print "encoding error in python"
print("encoding error in python")
return
if encoding_format:
d['text'] = [unicode(item,encoding_format) for item in d['text']]
d['text'] = [str(item,encoding_format) for item in d['text']]
else:
d['text'] = [unicode(item) for item in d['text']]
d['text'] = [str(item) for item in d['text']]
d['text'],d['font'],d['color']= comparing_text_font_grids(d['text'],d['font'],d['color'])
for item in d['text']:
if item.count(" ")<32:
......
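The switch from `unicode(...)` to `str(...)` changes behaviour slightly on Python 3 when the items are byte strings. The sketch below isolates that step. `decode_items` is a hypothetical helper name, and whether the extension actually hands over `bytes` is an assumption that this diff does not confirm.

```python
def decode_items(items, encoding_format=""):
    """Mirror the two decoding branches of generate_output_srt()."""
    if encoding_format:
        # With an encoding, str() decodes bytes into text.
        return [str(item, encoding_format) for item in items]
    # Without one, str() on bytes returns the repr form on Python 3.
    return [str(item) for item in items]


raw = [b"caf\xc3\xa9", b"HELLO"]
decoded = decode_items(raw, "utf-8")
undecoded = decode_items([b"HI"])
```

If the items do arrive as `bytes`, the branch without an encoding would produce strings such as `b'HI'`, so an explicit decode may be preferable there.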
from __future__ import print_function
import sys
import os
import subprocess
......@@ -6,11 +7,11 @@ output_formats = ['.srt','.ass','.ssa','.webvtt','.sami','.txt','.original','.py
args_list = sys.argv[1:]
args_count = len(args_list)
if args_count>1:
print "wrong usage"
print("wrong usage")
exit(0)
directory = args_list[0]
if not os.path.isdir(directory):
print "error: path given is not a directory"
print("error: path given is not a directory")
exit(0)
files = []
for item in os.listdir(directory):
......@@ -18,8 +19,8 @@ for item in os.listdir(directory):
if ext not in output_formats:
files.append(os.path.join(directory,item))
for sample in files:
print "Processing file: "+sample
print("Processing file: "+sample)
#command=['../linux/ccextractor',sample]
command = ['python','api_testing.py',sample]
subprocess.call(command)
print "Finished processing file: "+sample
print("Finished processing file: "+sample)
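The file selection in this script can be sketched as a small function. The diff elides the line that computes `ext`, so `os.path.splitext` is an assumption here, and `collect_samples` and the shortened format list are illustrative names and values rather than anything taken from the repository.

```python
import os
import tempfile


def collect_samples(directory, output_formats):
    """Return paths in `directory` whose extension is not an output format."""
    samples = []
    for item in sorted(os.listdir(directory)):
        ext = os.path.splitext(item)[1]  # assumed; this line is elided in the diff
        if ext not in output_formats:
            samples.append(os.path.join(directory, item))
    return samples


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as directory:
    for name in ("a.ts", "a.srt", "b.mpg"):
        open(os.path.join(directory, name), "w").close()
    found = [os.path.basename(p) for p in collect_samples(directory, [".srt", ".txt"])]
```

Sorting the directory listing is an addition in this sketch to make the order predictable. The original iterates in whatever order `os.listdir` returns.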
ccextractor was originally a mildly optimized C port of McPoodle's excellent
but painfully slow Perl script SCC_RIP. That port (ccextractor 0.01) was
written by Carlos Fernández (cfsmp3).
After a number of versions that did something semi-useful, Volker Quetschke
joined the effort, and together Carlos and Volker took CCExtractor to a point
at which it was actually usable, at least for the cases that interested
them.
Unfortunately, Volker moved on once CCExtractor did what he needed it to
do.
At some point David Liontooth from UCLA started to use CCExtractor as a
replacement for libzvbi because libzvbi wasn't working for some specific
streams. UCLA became the primary user as they were using CCExtractor
24x7 to process a huge number of streams from several countries, and were
therefore able to provide samples, proper bug reports, etc.
At that time CCExtractor was still US-centric, because it was originally
written so Carlos could get subtitles for US TV shows. But UCLA wanted
European subtitles too, and they already had recording nodes in Denmark
(which use teletext) and Spain (which uses DVB).
For teletext a good solution existed already: Petr Kutalek's telxcc.
We contacted Petr and asked for permission to integrate his code into
CCExtractor. Petr's absolutely brilliantly clean code was easy to
integrate and build upon - and with it, we added support for the first
kind of European subtitles.
Around that time, we decided to apply for Google Summer of Code. That
was also a game changer, with Willem, Ruslan and Anshul being the first
3 students. They are still around, now as mentors and year round
contributors.
Since then, many more people have been involved: More than 10 as
Google Summer of Code students, Code-In students, companies that
sponsored development by hiring team members to do custom development
(Comcast was the first one, and we'll always be grateful for the
opportunity).
List of students is below (if they added themselves). For a complete
list, just check the pull requests at GitHub.
Home: https://www.ccextractor.org
Google Summer of Code 2014 students
- Willem Van Iseghem
- Ruslan Kuchumov
- Anshul Maheshwari
Google Summer of Code 2015 students
- Willem Van Iseghem
- Ruslan Kuchumov
- Anshul Maheshwari
- Nurendra Choudhary
- Oleg Kiselev
- Vasanth Kalingeri
Google Summer of Code 2016 students
- Willem Van Iseghem
- Ruslan Kuchumov
- Abhishek Vinjamoori
- Abhinav Shukla
- Rishabh Garg
Google Code-in 2016 students
- Evgeny Shulgin
- Manveer Basra
- Alexandru Bratosin
- Matej Plavevski
- Danila Fedorin
Google Code-in 2017 students
- Matej Plavevski
- Harry Yu
- Theodore Fabian
- Nikunj Taneja
- John Chew
- Aadi Bajpai
- Wiliam(Hori75)
Google Summer of Code 2017 students
- Diptanshu Jamgade
- Mayank Gupta
0.86 (2017-01-09)
0.87 (2018-10-23)
-----------------
- New: Upgrade libGPAC to 0.7.1.
- New: mp4 tx3g & multitrack subtitles.
- New: Guide to update dependencies (docs/Updating_Dependencies.txt).
- New: Add LICENSE File (#959).
- New: Display quantisation mode in info box (#954).
- New: Add instruction required to build ccextractor with HARDSUBX support (#946).
- New: Added version no. of libraries to --version.
- New: Added -quant (OCR quantization function).
- New: Python API now compatible with Python 3.
- Fix: linux/builddebug: Added non-local directories to the include search path so we don't
require a locally compiled tesseract or leptonica.
- Fix: Correct -HARDSUBX Bug In CMake, allow build with hardsubx using cmake (#966).
- Fix: possible segfaults in hardsubx_classifier.c due to strdup (#963).
- Fix: Improve the start and end timestamps of extracted burned in captions (#962).
- Fix: Update COMPILATION.md (#960).
- Fix: Fixed crash with "-out=report" and "-out=null".
- Fix: -nocf not working with OCR'ing (#958).
- Fix: segfault in add_cc_sub_text and initialize to NULL in init_encoder (#950).
- Fix: ccx_decoders_common.c: Copy data type when creating a copy of the subtitle structure.
- Fix: Implicit declaration of these functions throws warning during build (#948).
- Fix: ccx_decoders_common.c: Properly release allocated resources on free_subtitle().
- Fix: Added a datatype member to struct cc_subtitle - needed so we can properly free all
memory when void *data points to a structure that has its own pointers.
- Fix: dvb_subtitle_decoder.c: When combining image regions verify that the offset is
never negative.
- Fix: Updated travis.yml to fix osx build (#947).
- Fix: Add utf8proc src file to cmake, updated header file (#944).
- Fix: Added required pointers on freep() calls.
- Fix: Removed dvb_debug_traces_to_stdout and used the usual dbg_print instead.
- Fix: Additional debug traces for DVB.
- Fix: Fix minor memory leak in ocr.c.
- Fix: Fix issue with displaying utf8proc version.
- Fix: Fix failing cmake due to liblept/tesseract header files.
- Fix: Added missing \n in params.c.
- Fix: builddebug: Use -fsanitize=address -fno-omit-frame-pointer.
- Fix: ccx_decoders_common.c: Removed trivial memory leak.
- Fix: ccx_encoders_srt.c: Made sure a pointer is non-NULL before dereferencing.
- Fix: dvb_subtitle_decoder.c: Initialize pointer members to NULL when creating a structure.
- Fix: lib_ccx.c: Initialize (memset 0) structure cc_subtitle after memory allocation.
- Fix: Added verboseness to error/warnings in dvb_subtitle_decoder.c.
- Fix: dvb_subtitle_decoder.c: Work on passing invalid streams errors upstream (plus some
warning messages) so we can eventually recover from this situation instead of crashing.
- Fix: telxcc.c: Currently setting a colour doesn't necessarily add a space even though the
specifications mandate it. (#930).
- Fix: dvb_subtitle_decoder.c: Fix null pointer dereference when region==NULL in write_dvb_sub.
- Fix: DVB Teletext subtitle incomplete.
- Fix: replace all 0xA characters within startbox with 0x20.
- Fix: DVB Teletext subtitle incomplete (#922).
- Fix: Add missing return value to one of the returns in process_tx3g().
- Fix: Typos and other minor bugs.
- Fix: Tidy CMakeLists & vcxproj (#920).
- Fix: Added m2ts and -mxf to help screen.
- Fix: Added MKV to demuxer_print_cfg.
- Fix: Added MXF to demuxer_print_cfg.
- Fix: "Out of order packets" error had wrong print() parameters.
- Fix: Updated Python documentation.
- Fix: Fix incorrect path in XML (#904).
- Fix: linux build script (non-debug): Don't hide warnings from compiler.
- Fix: linux build script (debug): Display which step of the build script we're in.
- Fix: Make the build reproducible (#976).
- Fix: Remove instance of o1 and o2 from help.
- Fix: Colors of DVB subtitles with depth 2 broken due to a missing break.
- Fix: CEA-708: Caption loss due to CW command (#991).
- Fix: CEA-708: Update patch for windows priority with functions (#990).
0.86 (2018-01-09)
-----------------
- New: Preliminary MXF support
- New: Added a histogram in one-minute increments of the number of lines in a subtitle.
......
......@@ -51,6 +51,9 @@ make
# test your build
./ccextractor
# make build systemwide
sudo make install
```
**Using CMake**
......@@ -69,6 +72,9 @@ make
# test your build
./ccextractor
# make build systemwide
sudo make install
```