======================================================================
                             ChaSen 1.51
               Japanese Morphological Analysis System
                             Jul 29, 1997
======================================================================

0. files and directories

   README                this file
   CHANGES               changes
   lib/                  ChaSen library and its sources
   chasen/               ChaSen program and its sources
   mkchadic/             programs for making dictionaries and their sources
   prolog/               Prolog program to use ChaSen
   doc/                  manuals
   dic/                  dictionaries

1. install

   Let $CHASEN be the home directory of the ChaSen system.

   1. Modify $CHASEN/Makefile as you like and type "make".  This
      produces all the necessary object programs.

   2. Type "make dic".  This produces the system dictionaries.
      Compiling the current dictionary takes about 3 minutes on a
      Sun SPARCstation 20.  The compiled dictionary (chadic.int and
      chadic.pat) is about 7.5 Mbytes in total.

   3. Type "make install" to install the programs.  This does not
      install the ChaSen library or the Emacs Lisp version of the
      ChaSen client.

   4. To use a user-specific chasenrc file, copy $CHASEN/chasenrc to
      the user's home directory under the name ".chasenrc" or
      ".jumanrc", and modify .chasenrc so that it correctly indicates
      the directories for the grammar definition files and the
      dictionaries.

      The second line in .chasenrc defines the directory of the
      grammar files, the initial value of which is
      /usr/local/lib/chasen/dic.

      The fourth line in .chasenrc is for compatibility with JUMAN
      2.0.  The ChaSen system does not refer to this line.  Change
      the directory name to the JUMAN 2.0 dictionary directory if
      .jumanrc is to be shared with the JUMAN 2.0 system.

      The seventh line in .chasenrc defines the directory of the 
      dictionary files and their.
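
      The line positions described above can be checked before
      running ChaSen.  The following Python sketch only prints the
      second, fourth and seventh lines of a chasenrc-style file; it
      does not parse them.  The helper names are hypothetical, and
      the Japanese EUC encoding of the file is an assumption.

```python
def pick_lines(text, wanted=(2, 4, 7)):
    """Return {line_number: line} for the 1-based line numbers in `wanted`."""
    lines = text.splitlines()
    return {n: lines[n - 1] for n in wanted if 1 <= n <= len(lines)}


def show_rc_directories(path):
    """Print the lines of `path` that this README says hold directory settings."""
    # Assumption: the file is in Japanese EUC; undecodable bytes are replaced.
    with open(path, encoding="euc_jp", errors="replace") as f:
        for number, line in pick_lines(f.read()).items():
            print(f"line {number}: {line}")
```

      Calling show_rc_directories with the path of the .chasenrc in
      your home directory lists the three lines to review.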

2. how to use ChaSen system

   Suppose you have a Japanese text file "nihongo", encoded in
   Japanese EUC (Extended UNIX Code) or JIS (ISO-2022-JP).  Issue the
   following command:

     chasen nihongo

   The result of the morphological analysis is written to the
   standard output.  If your terminal supports direct input of
   Japanese characters, simply type

     chasen

   then input a Japanese sentence followed by a carriage return.
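
   The same interaction can be scripted.  The following Python sketch
   feeds one sentence to ChaSen's standard input.  It assumes that a
   "chasen" executable is on PATH and that the output uses the same
   encoding as the input, which this README does not state; the
   helper names are hypothetical.

```python
import subprocess


def encode_input(sentence, encoding="euc_jp"):
    """Encode one sentence, terminated by a newline, for ChaSen's standard input."""
    return (sentence.rstrip("\n") + "\n").encode(encoding)


def analyze(sentence, encoding="euc_jp"):
    """Run `chasen` with no arguments and return its standard output as text."""
    # Assumption: `chasen` is on PATH and replies in the same `encoding`.
    done = subprocess.run(["chasen"], input=encode_input(sentence, encoding),
                          stdout=subprocess.PIPE, check=True)
    return done.stdout.decode(encoding, errors="replace")
```

   For JIS input, pass encoding="iso2022_jp", which is Python's codec
   name for ISO-2022-JP.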

   You can also run ChaSen as a server and connect to it with a
   client.  First, type

     chasen -s

   to start the ChaSen server.  Then type

     chasen -Dhost nihongo

   (`host' should be the hostname of ChaSen server) to run ChaSen
   client.
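
   When the server and client are launched from a script, the two
   command lines can be assembled as argument lists.  The following
   Python sketch uses only the -s, -P and -D options described in
   this section; the function names are hypothetical, and the lists
   are meant to be passed to something like subprocess.run.

```python
def server_argv(port=None):
    """Arguments that start a ChaSen server, optionally on a given port (-P)."""
    argv = ["chasen", "-s"]
    if port is not None:
        argv += ["-P", str(port)]
    return argv


def client_argv(host, files, port=None):
    """Arguments that run ChaSen as a client of the server on `host`."""
    target = host if port is None else f"{host}:{port}"
    # The host is attached directly to -D, as in "chasen -Dhost nihongo".
    return ["chasen", "-D" + target] + list(files)
```

   For example, client_argv("kyusu", ["nihongo"], port=31234) yields
   the arguments for "chasen -Dkyusu:31234 nihongo".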

   There are several options:

   (how to run)
     -s             start ChaSen server
     -P port        specify ChaSen server's port number
                    (use with -s; the default is 31000)
     -D host[:port] connect to ChaSen server
     -R             do not read the chasenrc file (use with -D)
     -a             run standalone even if environment variable CHASENSERVER
                    is set
   (how to print ambiguous results)
     -b             print one result with the least cost (default)
     -m             print ambiguous parts explicitly
     -p             print all possible results independently
   (output format)
     -f             print the result in a table-like format (default)
     -e             print all information of each morpheme separated by a blank
     -c             print all information of each morpheme in internal codes
     -d             print detailed morpheme data for Prolog
     -v             print detailed morpheme data for ViCha
     -F             print morpheme data with formatted output
     -Fh            print help on the format of the -F option
   (others)
     -j             Japanese sentence mode 
                    (assume a punctuation mark as a sentence delimiter)
     -w width       specify the cost width
     -C             use command mode
     -r rc_file     use rc_file as a chasenrc file other than the default
     -h             print the help message
     -V             print ChaSen version number

   For example, compare the default output with the result of the
   following command:

       chasen -m -e nihongo

   Notes on the -F option.

   format characters:
     %m     surface form (inflected form)
     %M     surface form (base form)
     %y     pronunciation
     %Y     reading (if the morpheme is undefined, print "̤")
     %rAB   surface form with ruby
     %i     semantic information
     %Ic    semantic information (if NIL, print character 'c'.)
     %h     part of speech (code)
     %H     part of speech (name)
     %b     sub-part of speech (code)
     %BB    sub-part of speech (name)(if not, print part of speech)
     %BM    sub-part of speech (name)(if not, print part of speech)
     %Bc    sub-part of speech (name)(if not, print character 'c')
     %t     inflection type (code)
     %Tc    inflection type (name)(if not, print character 'c')
     %f     inflected form (code)
     %Fc    inflected form (name)(if not, print character 'c')
     %c     cost value of the morpheme
     %%     '%'
     .      specify the field width
     -      specify the field width
     1-9    specify the field width
     \n     carriage return
     \t     tab
     \\     backslash
     \'     single quotation mark
     \"     double quotation mark

   example:
     "%m %y %M %h %b %t %f\n"                - same as '-c' option
     "%m %Y %M %H %h %B* %b %T* %t %F* %f\n" - same as '-e' option
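
   To make the substitution rules concrete, the following Python
   sketch expands a subset of the format characters against one
   morpheme.  It is an illustration only, not ChaSen's
   implementation: field widths and the directives that take an
   extra character (%rAB, %Ic, %Bc, %Tc, %Fc and so on) are left
   out, and the dictionary keys are hypothetical names chosen for
   this example.

```python
ESCAPES = {"n": "\n", "t": "\t", "\\": "\\", "'": "'", '"': '"'}
FIELDS = {"m": "surface", "M": "base", "y": "pronunciation",
          "h": "pos_code", "b": "subpos_code", "t": "infl_type_code",
          "f": "infl_form_code", "c": "cost"}


def expand(fmt, morpheme):
    """Expand a subset of the -F format characters against a dict `morpheme`."""
    out = []
    i = 0
    while i < len(fmt):
        ch = fmt[i]
        nxt = fmt[i + 1] if i + 1 < len(fmt) else ""
        if ch == "%" and nxt == "%":
            out.append("%")
            i += 2
        elif ch == "%" and nxt in FIELDS:
            out.append(str(morpheme[FIELDS[nxt]]))
            i += 2
        elif ch == "\\" and nxt in ESCAPES:
            out.append(ESCAPES[nxt])
            i += 2
        else:
            # Anything else, including unsupported directives, is copied as-is.
            out.append(ch)
            i += 1
    return "".join(out)
```

   Given a dictionary with placeholder values for those keys,
   expand(r"%m %y %M %h %b %t %f\n", morpheme) joins the seven fields
   with blanks and ends the line, mirroring the layout of the first
   example above.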

   You can also specify the output format by setting a parameter in
   .chasenrc.  For example, if .chasenrc contains the following line,
   only the surface form will be printed.

   (出力フォーマット "%m ")

   Note that -f, -e, -c, -d and -F override the format defined in 
   .chasenrc.

3. Emacs Lisp version of ChaSen client

   To install, copy $CHASEN/chasen/chasen.el to your Emacs Lisp
   directory.  Then set the hostname and port number of the ChaSen
   server and declare the autoloaded functions in your .emacs.

     (setq chasen-server-host "kyusu")
     (setq chasen-server-port 31234)    ; the default is 31000

     (autoload 'chasen-region "chasen" "ChaSen client" t)
     (autoload 'chasen-line "chasen" "ChaSen client" t)
     (autoload 'chasen-highlight-class-region "chasen" "ChaSen client" t)
     (autoload 'chasen-property-class-region "chasen" "ChaSen client" t)

4. ChaSen library

   You can use the ChaSen library to incorporate ChaSen's module into
   other programs.  See $CHASEN/doc/manual.tex for details.

----------------------------------------------------------------------
For further information, send an email to:
	chasen@cactus.aist-nara.ac.jp
