README  2003/11/11

Oniguruma  ----   (C) K.Kosako <kosako@sofnec.co.jp>

http://www.ruby-lang.org/cgi-bin/cvsweb.cgi/oniguruma/
http://www.freebsd.org/cgi/cvsweb.cgi/ports/devel/oniguruma/

Oniguruma is a regular expression library.
The distinguishing feature of this library is that a different character
encoding can be specified for each regular expression object.
(Supported character encodings: ASCII, UTF-8, EUC-JP, Shift_JIS)

This package can be used in two ways:

  * Built-in regular expression engine of Ruby
  * C library (supported APIs: GNU regex, POSIX, Oniguruma native)
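
As a rough illustration of the C library route through the POSIX API, the
sketch below compiles a pattern, searches a string, and reports the match
offsets. It uses only the standard regcomp(), regexec(), regerror() and
regfree() calls. The helper name posix_search() is hypothetical. The sketch
includes the system <regex.h> so that it builds anywhere; when building
against this library, the assumption is that onigposix.h would be included
instead and the installed library linked in.

```c
#include <stdio.h>
#include <regex.h>  /* assumption: replace with "onigposix.h" for this library */

/* Hypothetical helper: search `text` for the first match of the extended
 * regular expression `pattern`.
 * Returns  1 and stores the byte offsets in *start and *end on a match,
 *          0 when regexec() reports anything other than a match,
 *         -1 when the pattern fails to compile (message goes to stderr). */
int posix_search(const char *pattern, const char *text, int *start, int *end)
{
    regex_t reg;
    regmatch_t pm[1];
    char buf[256];
    int r;

    r = regcomp(&reg, pattern, REG_EXTENDED);
    if (r != 0) {
        regerror(r, &reg, buf, sizeof(buf));
        fprintf(stderr, "regcomp error: %s\n", buf);
        return -1;
    }

    r = regexec(&reg, text, 1, pm, 0);
    if (r == 0) {
        /* pm[0] describes the whole match. */
        *start = (int)pm[0].rm_so;
        *end   = (int)pm[0].rm_eo;
    }

    regfree(&reg);
    return (r == 0) ? 1 : 0;
}
```

For example, posix_search("b+", "aabbbcc", &s, &e) returns 1 with s = 2 and
e = 5. See sample/posix.c for the sample shipped with the package.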


Install

A. Install into Ruby

   See INSTALL-RUBY.


B. C library

 B1. Unix, Cygwin

   1. ./configure
   2. make
   3. make install

   (* uninstall:  make uninstall)

  * test (EUC-JP)
   4. make ctest


 B2. Win32 platform (VC++)

   1. copy win32\config.h config.h
   2. copy win32\Makefile Makefile
   3. nmake

        onig_s.lib:  static link library
        onig.dll:    dynamic link library

  * test (Shift_JIS)
   4. copy win32\testc.c testc.c
   5. nmake ctest



License

   When this software is used in part by Ruby or is distributed with Ruby,
   it follows the license of Ruby.
   In all other cases it follows the BSD license.


Source Files

  oniguruma.h    Oniguruma and GNU regex API header file
  regint.h       internal definitions
  regparse.h     internal definitions for regparse.c and regcomp.c
  regparse.c     parsing functions
  regcomp.c      compiling and optimization functions
  regerror.c     error message function
  regex.c        source files wrapper for Ruby
  regexec.c      search and match functions

  reggnu.c       GNU regex API functions

  onigposix.h    POSIX API header file
  regposerr.c    POSIX API error message function (regerror)
  regposix.c     POSIX API functions

  sample/simple.c    minimal example (native API)
  sample/names.c     example of the named group callback
  sample/listcap.c   example of the capture history
  sample/posix.c     POSIX API sample
  sample/sql.c       example of the variable meta characters
                     (SQL-like pattern matching)


Regular expression

  See doc/RE (or doc/RE.ja for Japanese).



API differences from the Japanized GNU regex (version 0.12) of Ruby

   + re_compile_fastmap() is removed.
   + re_recompile_pattern() is added.
   + re_alloc_pattern() is added.


ToDo

  1 support 16-bit and 31-bit encodings. (UCS-2, UCS-4, UTF-16)

  ? transmission stopper. (return REG_STOP from match_at())
  ? cut operator. (clear all current alternatives)
  ? /a{n}?/ should be interpreted as /(?:a{n})?/
  ? implement syntax behavior REG_SYN_CONTEXT_INDEP_ANCHORS.
  ? pattern encoding different from target.
    (ex. UCS-2 Big Endian and UCS-2 Little Endian)
  ? better access to hash table.
    (a version of st_lookup() for non-null-terminated keys)
  ? grep-like tool 'onigrep'.
  ? check invalid wide char value in WC2MB, WC2MB_FIRST on Ruby M17N.
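
The hash table item above can be pictured with a small sketch: a key given as
a pointer plus a length, so that a name embedded in a pattern can be looked up
without copying it into a null-terminated buffer first. The RangeKey type and
both functions are hypothetical and are not part of st.c or of this library;
they only show the kind of compare and hash callbacks such a lookup would
need.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical key type: a byte range inside a larger buffer.
 * No trailing NUL is required. */
typedef struct {
    const unsigned char *s;
    size_t len;
} RangeKey;

/* Compare two range keys. Returns 0 when they are equal and nonzero
 * otherwise, the convention a hash table comparison callback would use. */
static int range_key_cmp(const RangeKey *a, const RangeKey *b)
{
    if (a->len != b->len) return 1;
    return memcmp(a->s, b->s, a->len) != 0;
}

/* Simple multiplicative hash over exactly len bytes of the key. */
static unsigned long range_key_hash(const RangeKey *k)
{
    unsigned long h = 0;
    size_t i;

    for (i = 0; i < k->len; i++)
        h = h * 31 + k->s[i];
    return h;
}
```

With keys like these, the name inside a pattern such as (?<year>...) could be
hashed and compared in place, using a pointer to the 'y' and a length of 4.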


I am thankful to Akinori MUSHA.
