
BugTrack-Other Notes/37

SimpleParse

Posted by: みゅ
Category: Python
Priority: Normal
Status: Done
Date: 2011-01-11 09:50:06

Contents

Notes

Links

Installation

Download SimpleParse-2.1.1a2.tar.gz
Extract the archive
python2.x setup.py install
- This alone did not install mxtools, so also run:
python2.x setup.py install_lib

SimpleParse Grammars: translation notes

SimpleParse Grammars

SimpleParse uses a particular EBNF grammar which reflects the current set of features in the system. Though the system is modular enough that you could replace that grammar, most users will simply want to use the provided grammar. This document provides a quick reference for the various features of the grammar with examples of use and descriptions of their effects.
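- In short: SimpleParse uses its own EBNF dialect, which reflects the system's current feature set. The grammar could be replaced because the system is modular, but most users will simply use the one provided. This document is a quick reference to the grammar's features, with usage examples and descriptions of their effects.
Prerequisites
- Knowledge of Python 2.x programming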
- Some familiarity with EBNF grammars and other parsing terminology

An Example

The following is an example of a basic SimpleParse grammar:

       declaration = r'''# note use of raw string when embedding in python code...
       file           :=  [ \t\n]*, section+
       section        :=  '[',identifier!,']'!, ts,'\n', body
       body           :=  statement*
       statement      :=  (ts,semicolon_comment)/equality/nullline
       nullline       :=  ts,'\n'
       comment        :=  -'\n'*
       equality       :=  ts, identifier,ts,'=',ts,identified,ts,'\n'
       identifier     :=  [a-zA-Z], [a-zA-Z0-9_]*
       identified     :=  string/number/identifier
       ts             :=  [ \t]*
       '''

You can see that the format allows for comments in Python style, and fairly free-form treatment of whitespace around the various items (i.e. “s:=x” and “s := x” are equivalent). The grammar is actually written such that you can break productions (rules) across multiple lines if that will make your grammar more readable. The grammar also allows both ':=' and '::=' for the "defined as" symbol.
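- In short: the format accepts Python-style comments and treats whitespace around items very flexibly (“s:=x” and “s := x” are equivalent). Productions may be split across several lines for readability, and both ':=' and '::=' are accepted as the "defined as" symbol.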
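
As a rough sketch of how such a declaration might be used, the snippet below builds a parser from a cut-down version of the grammar above and runs it on a small input. It assumes the simpleparse.parser.Parser class and a parse() method that returns a (success, children, next position) tuple. The reduced grammar, the sample text, and the helper name parse_sections are made up for illustration, and the import is guarded so the helper simply returns None when SimpleParse is not installed.

```python
# Reduced version of the grammar above: no error-on-fail markers (!) and no
# productions that are not defined here (string, number, semicolon_comment).
declaration = r'''
file       := [ \t\n]*, section+
section    := '[', identifier, ']', ts, '\n', body
body       := statement*
statement  := equality / nullline
nullline   := ts, '\n'
equality   := ts, identifier, ts, '=', ts, identifier, ts, '\n'
identifier := [a-zA-Z], [a-zA-Z0-9_]*
ts         := [ \t]*
'''

SAMPLE = "[main]\nkey = value\n\nother = thing\n"


def parse_sections(text):
    """Parse text with the 'file' production; None if SimpleParse is missing."""
    try:
        from simpleparse.parser import Parser  # assumed import path
    except ImportError:
        return None
    parser = Parser(declaration, "file")
    success, children, next_pos = parser.parse(text)
    # A complete parse succeeds and stops at the end of the input.
    return bool(success) and next_pos == len(text)


if __name__ == "__main__":
    print(parse_sections(SAMPLE))
```

The raw string matters here: it keeps sequences such as '\n' intact so that the grammar, not Python, interprets them.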

Element Tokens

Element tokens are the basic operational unit of the grammar. The various tokens are implemented in the module simpleparse.objectgenerator; their syntax is defined in the module simpleparse.simpleparsegrammar. You can read a formal definition of the grammar used to define them at the end of this document.
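- In short: element tokens are the grammar's basic operational units. They are implemented in the simpleparse.objectgenerator module, their syntax is defined in the simpleparse.simpleparsegrammar module, and the formal definition of that grammar can be read at the end of this document.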
Character Range

[ \t\n]
[a-zA-Z]
[a-zA-Z0-9_]

Matches any single character within the given range.
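
For readers who know Python's re module, character ranges behave much like regular-expression character classes. The sketch below uses re purely as an analogy (it is not how SimpleParse matches internally) to show that one range matches exactly one character, which is why the identifier production in the example is a range followed by a repeated range. The name is_identifier is hypothetical.

```python
import re

# [a-zA-Z], [a-zA-Z0-9_]*  written as an equivalent regular expression
IDENTIFIER = re.compile(r"[a-zA-Z][a-zA-Z0-9_]*\Z")


def is_identifier(text):
    """True if text is one letter followed by letters, digits or underscores."""
    return IDENTIFIER.match(text) is not None


print(is_identifier("key_1"))  # True
print(is_identifier("1key"))   # False: the first range only allows a letter
```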

String Literal

"["
'['
'\t'
'\xa0'

Matches the given sequence of characters. Special, octal, and hexadecimal escape sequences may be used.

Case-insensitive String Literal (new in 2.0.1a2)

c"this"
c'this'
c' this\t\n'
c'\xa0'
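
The difference between the two literal forms can be sketched in plain Python. The helper match_literal below is hypothetical and not part of SimpleParse: it returns the position just past the literal when the literal occurs at pos, and None otherwise, with ignore_case standing in for the c prefix. It assumes that case-insensitive matching amounts to comparing lower-cased text.

```python
def match_literal(text, pos, literal, ignore_case=False):
    """Return the index just past literal if it occurs at pos, else None."""
    candidate = text[pos:pos + len(literal)]
    if ignore_case:
        matched = candidate.lower() == literal.lower()
    else:
        matched = candidate == literal
    return pos + len(literal) if matched else None


print(match_literal("This is", 0, "this"))                    # None
print(match_literal("This is", 0, "this", ignore_case=True))  # 4
```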

Character Classes, Strings and Escape Characters

Both character classes and strings in SimpleParse may use octal escapes (1 to 3 octal digits), hexadecimal escapes (2 digits), or the standard Python character escapes \a \b \f \n \r \t \v. Note that the “u” and “U” escapes are missing. Note also that “\n” is interpreted according to the local machine and may therefore be non-portable, so an explicit hexadecimal code may be more suitable where the distinction matters. Strings may be either single or double quoted (but not triple quoted).
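
Python string literals use the same octal and hexadecimal notations, which makes the equivalences easy to check. The sketch below is only an illustration in Python itself, with made-up variable names. It also shows why the grammar is normally embedded in a raw string: there the backslash sequence is left as written instead of being expanded by Python first.

```python
# Octal (1 to 3 digits) and hexadecimal (2 digits) escapes for the same character.
octal_a = '\101'
hex_a = '\x41'
print(octal_a == hex_a == 'A')  # True

# In a raw string the escape stays as two characters (backslash and "n"),
# so whatever reads the grammar text gets to interpret it.
raw_newline = r'\n'
plain_newline = '\n'
print(len(raw_newline), len(plain_newline))  # 2 1
```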
To include a "]" character in a character class, make it the first character of the class. Similarly, a literal "-" character must be either the first (after the optional "]" character) or the last character. The grammar definition for a character class is as follows:

'[', CHARBRACE?,CHARDASH?, (CHARRANGE/CHARNOBRACE)*, CHARDASH?, ']'

It is a common error to have declared something like [+-*] as a character range (every character including and between + and *) intending to specify [-+*] or [+*-] (three distinct characters). Symptoms include not matching '-' or matching characters that were not expected.
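
The placement rules for "]" and "-" can be tried out with Python's re module, whose character classes follow similar conventions. This is an analogy only: re is not SimpleParse, and the helper matched_chars and its candidate string are made up for illustration. One visible difference is that re rejects a range whose end precedes its start, whereas the text above describes SimpleParse as silently matching something unintended.

```python
import re


def matched_chars(char_class, candidates="*+,-.]"):
    """Return the candidate characters that the character class accepts."""
    pattern = re.compile(char_class)
    return [c for c in candidates if pattern.match(c)]


print(matched_chars("[-+*]"))  # ['*', '+', '-']  dash first: three literals
print(matched_chars("[+*-]"))  # ['*', '+', '-']  dash last: three literals
print(matched_chars("[*-+]"))  # ['*', '+']       a range, so '-' is not matched
print(matched_chars("[]+]"))   # ['+', ']']       ']' first is a literal

try:
    re.compile("[+-*]")
    reversed_range_ok = True
except re.error:
    reversed_range_ok = False
print(reversed_range_ok)       # False: re refuses the reversed range
```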

Modifiers/Operators

-"this" -those* -[A-Z]+ -(them,their) -(them/their)

Match a single character at the current position if the entire base element token doesn't match. If repeating, match any number of characters until the base element token matches.
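- In short: a negated token matches one character at the current position when the whole base element token fails to match there. With a repetition suffix, it keeps consuming characters until the base element token matches.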
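
The negation semantics can be sketched in plain Python. The helper names (literal, negate_once, negate_repeating) are invented for illustration and are not SimpleParse APIs. Matchers here return the new position on success and None on failure, and the sketch assumes that negation never consumes past the end of the input.

```python
def literal(s):
    """Build a matcher for the string literal s."""
    def match(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return match


def negate_once(text, pos, base):
    """-base: consume one character if base does not match at pos."""
    if pos < len(text) and base(text, pos) is None:
        return pos + 1
    return None


def negate_repeating(text, pos, base):
    """-base*: consume characters until base matches or the input ends."""
    while pos < len(text) and base(text, pos) is None:
        pos += 1
    return pos


line = "; a comment\nkey = value\n"
newline = literal("\n")
end = negate_repeating(line, 0, newline)  # like  -'\n'*
print(repr(line[:end]))                   # '; a comment'
print(negate_once(line, end, newline))    # None: the base token matches here
```

This is the behaviour the comment production in the example grammar relies on: -'\n'* consumes everything up to, but not including, the end of the line.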
