The tool fslex.exe is a standard lexer generator along the lines of flex and ocamllex. The input specifies a set of finite state machines that consume ASCII input and recognize the longest possible match according to a set of regular expressions. On each match the engine consumes the matched input and evaluates the corresponding expression, which may in turn involve a transition to another state and recursive invocations of the lexical analysis engine.

Comments. Comments are delimited by "(*" and "*)". C#/C++-style "//" comments-to-end-of-line are also accepted.

Keywords. The keywords and reserved symbols of fslex are:

let 

Operators and Symbolic Keywords.

Identifiers.

Identifiers conform to the following specification:

    letter-char := [ A-Z a-z ]
    ident-start-char := 
      | letter-char
      | unicode-letter-char 
      | _ 
      
    ident-char :=
      | letter-or-digit-char
      | unicode-letter-or-digit-char
      | _ 
      | '

    ident := ident-start-char ident-char*

Unicode characters include those within the standard ranges. All input files are currently assumed to be UTF-8 encoded. See the C# specification for the definition of the Unicode characters accepted for the above classes of characters.
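
As a rough illustration of the identifier rule above, the following Python sketch checks whether a string is a well-formed identifier. It is not how the F# lexer is implemented. It assumes that letter-or-digit-char means a letter or a decimal digit, and it approximates the Unicode letter classes with Python's own notion of a word character rather than the C# definitions. The names IDENT_RE and is_ident are hypothetical.

```python
import re

# ident-start-char: a letter (ASCII or Unicode) or '_'.
# [^\W\d] reads as "a word character that is not a digit".
IDENT_START = r"[^\W\d]"
# ident-char: a letter, a digit, '_' or the prime character "'".
IDENT_CHAR = r"[\w']"
IDENT_RE = re.compile(IDENT_START + IDENT_CHAR + "*")


def is_ident(text):
    """Return True if text matches ident := ident-start-char ident-char*."""
    return IDENT_RE.fullmatch(text) is not None
```

Under these assumptions "x'" and "_tmp1" are accepted, while "1abc" and "'a" are rejected because a digit or a prime cannot start an identifier.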

Strings and characters.

The treatment of strings and characters follows that of OCaml. In addition, Unicode characters in UTF-8 encoded files may be embedded in strings, as in identifiers (see above). Trigraph-like specifications of Unicode characters may also be embedded, in an identical manner to C#:

    unicodegraph-short := '\\' 'u' hex hex hex hex
    unicodegraph-long :=  '\\' 'U' hex hex hex hex hex hex hex hex
    string-char :=
      | ...
      | unicodegraph-short 
      | unicodegraph-long 
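
To make the two escape forms concrete, here is a minimal Python sketch that replaces each unicodegraph in a piece of text with the character it names. It is an illustration only: the function name decode_unicodegraphs is hypothetical, and the sketch ignores every other escape sequence, including an escaped backslash placed in front of a 'u'.

```python
import re

# A backslash and 'u' followed by exactly 4 hex digits, or
# a backslash and 'U' followed by exactly 8 hex digits.
UNICODEGRAPH_RE = re.compile(r"\\u([0-9a-fA-F]{4})|\\U([0-9a-fA-F]{8})")


def decode_unicodegraphs(text):
    """Replace each unicodegraph escape in text with the character it names."""
    def replace(match):
        hex_digits = match.group(1) or match.group(2)
        # chr raises ValueError for values above 0x10FFFF, which are not
        # valid Unicode code points.
        return chr(int(hex_digits, 16))
    return UNICODEGRAPH_RE.sub(replace, text)
```

An escape with too few hex digits does not match either form and is left unchanged by this sketch.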

Precedence.

The treatment of precedence follows that of OCaml. The one exception is that the expression "!x.y" parses as "!(x.y)" rather than "(!x).y". The OCaml grammar uses uppercase/lowercase distinctions to make disambiguations like the following at parse time:

      OCaml: !A.b.C.d  == (!(A.b)).(C.d)
      OCaml: !a.b.c.d  == (((!a).b).c).d
      F#:    !A.b.C.d  == !(A.b.C.d)
      F#:    !a.b.c.d  == !(a.b.c.d)

Note that in the first example '!' binds two elements of a long-identifier chain, and in the second it only binds one. Thus the parsing depends totally on the fact that 'A' is upper case, and OCaml uses this fact to know that it represents a module name. F# deliberately allows values and module names to be both upper and lower case, and so F# cannot resolve the status of identifiers (i.e. whether an identifier is a module, value, constructor etc.) at parse time. Instead it does this when processing long identifier chains during typechecking (just as C# does). The above alteration means that parsing remains independent of identifier status.
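
The difference can be modelled with a small Python sketch that counts how many leading elements of a long-identifier chain the '!' operator binds under each rule. This is a simplified model built only from the two examples above, not a parser: it assumes that every capitalized element is a module name, and the function names are hypothetical.

```python
def ocaml_bang_binds(segments):
    """Number of leading segments that '!' binds under the OCaml rule.

    Capitalized segments are treated as module names, so '!' binds the
    module path plus the first lowercase (value) segment.
    """
    count = 0
    for segment in segments:
        count += 1
        if not segment[0].isupper():
            break  # reached the value; later segments are field accesses
    return count


def fsharp_bang_binds(segments):
    """Under the F# rule '!' binds the whole long identifier."""
    return len(segments)
```

For the chain A.b.C.d the OCaml model binds two elements and for a.b.c.d it binds one, while the F# model binds all four in both cases, so the F# result does not depend on the case of any element.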

Numeric Literals.

The lexical specification of constants is as follows:

      int :=
         | digit+              -- e.g. 34
         | 0 (x|X) hexdigit+   -- e.g. 0x22
         | 0 (o|O) octaldigit+ -- e.g. 0o42
         | 0 (b|B) bitdigit+   -- e.g. 0b10010

      int8 := <int>y        -- e.g. 34y
      uint8 := <int>uy      -- e.g. 34uy
      int16 := <int>s       -- e.g. 34s
      uint16 := <int>us     -- e.g. 34us
      int32 := <int>l       -- e.g. 34l
      uint32 := <int>ul     -- e.g. 34ul
      nativeint := <int>n   -- e.g. 34n
      unativeint := <int>un -- e.g. 34un
      int64 := <int>L       -- e.g. 34L
      uint64 := <int>UL     -- e.g. 34UL
      ieee32 := <float>{F|f} -- e.g. 3.0F or 3.0f
      ieee64 := <float>     -- e.g. 3.0

      bytestring := <string>B
      bytechar := <char>B

      float :=
        | -? digit+ . digit*
        | -? digit+ (. digit* )? (e|E) (+|-)? digit+

Negative integers are currently specified using the appropriate integer negation operator, e.g. "-3". This is under revision.
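
The numeric rules above can be exercised with a short Python sketch that classifies a token by its suffix. It is an illustration under stated assumptions rather than the actual lexer: the regular expressions are transcribed by hand from the productions above, and the names INT, FLOAT, SUFFIX_TYPES and classify are hypothetical.

```python
import re

# int: hexadecimal, octal, binary or decimal digits.
INT = r"(?:0[xX][0-9a-fA-F]+|0[oO][0-7]+|0[bB][01]+|[0-9]+)"
# float: digits with a fraction, or digits with an exponent.
FLOAT = r"(?:-?[0-9]+\.[0-9]*|-?[0-9]+(?:\.[0-9]*)?[eE][+-]?[0-9]+)"

# Integer suffixes and the literal class each one selects.
SUFFIX_TYPES = {
    "y": "int8", "uy": "uint8", "s": "int16", "us": "uint16",
    "l": "int32", "ul": "uint32", "n": "nativeint", "un": "unativeint",
    "L": "int64", "UL": "uint64",
}


def classify(token):
    """Return the literal class of token, or None if it is not a numeric literal."""
    match = re.fullmatch(INT + r"(uy|us|ul|un|UL|y|s|l|n|L)?", token)
    if match:
        # No suffix means a plain int.
        return SUFFIX_TYPES.get(match.group(1), "int")
    match = re.fullmatch(FLOAT + r"([Ff])?", token)
    if match:
        return "ieee32" if match.group(1) else "ieee64"
    return None
```

In this sketch "-3" is not classified as a literal at all, which matches the statement that negative integers are written with the negation operator.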

Quotes are used to indicate concrete syntax where a symbol is also used in the notation of the grammar itself, e.g. '<' and '|'. Constructs with lower precedence are given first. The notation ... indicates repetition of the preceding non-terminal construct, with the repetition extending to the surrounding delimiters, e.g. <expr> ',' ... ',' <expr> means a sequence of one or more <expr>s separated by commas.
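
The meaning of the ... notation can be sketched in Python as a check over a list of tokens that have already been split. The function name parse_separated is hypothetical, and the sketch treats every non-separator token as an item, which a real parser would not do.

```python
def parse_separated(tokens, sep):
    """Parse `item sep ... sep item`: one or more items separated by sep."""
    if not tokens:
        raise ValueError("expected at least one item")
    items = []
    expect_item = True
    for token in tokens:
        if expect_item:
            if token == sep:
                raise ValueError("expected an item, found a separator")
            items.append(token)
        elif token != sep:
            raise ValueError(f"expected {sep!r}, found {token!r}")
        # Items and separators must strictly alternate.
        expect_item = not expect_item
    if expect_item:
        raise ValueError("trailing separator")
    return items
```

A single item with no separator is accepted, while an empty sequence or a trailing separator is rejected, which is the "one or more" reading given above.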

Basic Elements.

    ident := see above textual description of identifiers
    infix := see above textual description of operators
    prefix := see above textual description of operators
    string := see above textual description of strings
    char := see above textual description of chars

    longident :=  <ident> '.' ... '.' <ident> 

Constants.

    const := 
      | <int>                         -- 32-bit signed integer
      | <int8> | <int16> | <int32> | <int64>    -- 8, 16, 32 and 64-bit signed integers
      | <uint8> | <uint16> | <uint32> | <uint64> -- 8, 16, 32 and 64-bit unsigned integers
      | <ieee32>                      -- 32-bit 'Single' floating point number of type 'float32'
      | <ieee64>                      -- 64-bit 'Double' floating point number of type 'float'
      | <char>                        -- Unicode character of type 'char'
      | <string>                      -- String of type 'string' (i.e. System.String)
      | <bytestring>                  -- String of type 'byte[]' 
      | <bytechar>                    -- Char of type 'byte'

Expressions, Patterns and Value Declarations.

    expr :=  
      |  <expr> ; <expr>               -- sequenced computations
      |  begin <expr> end              -- block expressions
      |  ( <expr> )                    -- block expressions
      |  ( <expr> : <type> )           -- type annotations
      |  let <val-decls> in <expr>     -- locally bind values
      |  let rec <val-decls> in <expr> -- locally bind mutually referential values
      |  function <rules>              -- a function value that executes the given pattern matching
      |  match <expr> with <rules>     -- match a value and execute the resulting target
      |  try <expr> with <rules>       -- execute an exit block if an exception is raised
      |  try <expr> finally <expr>     -- always execute an exit block
      |  if <expr> then <expr> else <expr> -- conditionals
      |  if <expr> then <expr>         -- conditional statements
      |  while <expr> do <expr> done   -- while loops
      |  for <ident> = <expr> to <expr> do <expr> done   -- for loops
      |  lazy <expr>                   -- delayed computations
      |  assert <expr>                 -- checked computations
      |  <expr> := <expr>              -- assignments to reference cells
      |  <expr> <- <expr>              -- property and field assignments
      |  <expr>.(<expr>)               -- operator '.()', defaults to array lookup
      |  <expr>.(<expr>) <- <expr>     -- operator '.()<-', defaults to array assignment
      |  <expr>.[<expr>]               -- operator '.[]', defaults to string lookup
      |  <expr> , ... , <expr>         -- tuple expressions
      |  { <field-exprs> }             -- record expressions 
      |  { <expr> with <field-exprs> } -- copy-and-update record expressions
      |  <expr> <infix> <expr>         -- infix expressions
      |  <prefix> <expr>               -- prefix expressions
      |  <expr> <expr>                 -- application/invocation 
      |  <expr> '.' <expr>             -- member access
      |  <ident>                       -- a value
      |  ()                            -- the 'unit' value
      |  [ <expr> ; ... ; <expr> ]     -- list expressions
      |  [| <expr> ; ... ; <expr> |]   -- array expressions
      |  false | true                  -- boolean constants
      |  <const>                       -- a constant value
      |  new <object-construction>     -- object expression
      |  { new <object-construction> with <val-decls>} -- object expression with overrides
      |  null                          -- the 'null' value for a .NET type
      |  ( <expr> :? <type> )          -- dynamic type test
      |  ( <expr> :> <type> )          -- static upcast coercion
      |  ( <expr> :?> <type> )         -- dynamic downcast coercion
      |  upcast <expr>                 -- static upcast coercion to inferred type
      |  downcast <expr>               -- dynamic downcast coercion to inferred type

    val-decl := 
      | <pat> = <expr>                 -- bind the expression to the pattern 
      | do <expr>                      -- execute a statement as a binding

    field-expr :=
      |  <longident> = <expr>          -- specify a value for a field

    object-construction:=
      |  <type>(<exprs>)               -- constructor call for an object expression for a class
      |  <type>                        -- an object expression for an interface
      |  <object-construction> as <ident>  -- name the 'base' object

    rule := <pat> {when <expr>} -> <expr>    -- pattern, optional guard and action

    pat := 
      | <const>                       -- constant pattern
      | <longident>                   -- variable binding, nullary constructor or named literal
      | <pat> as <ident>              -- name the matched expression
      | <pat> | <pat>                 -- 'or' patterns
      | <pat> :: <pat>                -- 'cons' patterns
      | [<pat> ; ... ; <pat>]         -- list patterns
      | (<pat>,...,<pat>)             -- tuple patterns
      | {<field-pat> ; ... ; <field-pat>} -- record patterns
      | :? <type>                     -- dynamic type test patterns
      | :? <type> as  <ident>         -- dynamic type test patterns, with named result
      | <longident> <pat>             -- Unary constructor
      | <longident> (<pats>)          -- N-ary constructor
      | _                             -- wildcard pattern
      | null                          -- null-test pattern

    field-pat := <longident> = <pat>

    exprs := <expr> ',' ... ',' <expr> 
    pats :=  <pat> , ... , <pat> 
    val-decls := <val-decl> and ... and <val-decl> 
    field-exprs := <field-expr> ; ... ; <field-expr> 
    field-pats := <field-pat> ; ... ; <field-pat> 
    rules := {'|'} <rule> '|' ... '|' <rule>  -- multiple rules

Types.

    type :=  
      |  <type> -> <type>              -- function type
      |  <type> * ... * <type>         -- tuple type
      |  ( <type> )                    -- parenthesized type
      |  <ident>. ... .<ident>         -- named type
      |  '<ident>                      -- variable type
      |  <type> <ident>. ... .<ident>  -- constructed type, e.g. 'int list'
      |  ( <types> ) <ident>. ... .<ident>   -- constructed type, e.g. '(int,string) map'
      |  <ident>. ... .<ident>'<'<type>'>'   -- alternative syntax for constructed types, e.g. list<int>
      |  <type>[]                      -- .NET array type
      |  <type>[,]                     -- .NET two-dimensional array type
      |  <type> lazy                   -- lazy type
      |  _                             -- wildcard type, for use with type annotations
      |  '<ident>. <type>              -- first-class generic type, for record field types only

Type Definitions.

    type-decl := 
          ...

    exception-decls := 
          ...

    exception-decl := 
          ...

    type-decls := <type-decl> ; ... ; <type-decl> 

Interface Files (.mli or .fsi).

    intf-file-decl := 
      | val <ident>: <type>              -- value specifications
      | type <type-decls>                -- type specifications
      | exception <exception-decls>      -- exception specifications

    intf-file-decls := <intf-file-decl> ... <intf-file-decl>

    interface-file :=
      | module <longident> <intf-file-decls> -- named module
      | <intf-file-decls>                -- anonymous module (name implicit from filename)

Note: types for value specifications are syntactically identical to types, except that the parenthesization has additional significance for typechecking and interoperability. See the advanced section of the manual for more details.

Implementation Files (.ml or .fs).

    impl-file-decl := 
      | let <val-decls>                  -- top level value definitions
      | let rec <val-decls>              -- top level mutually-referential value definitions
      | type <type-decls>                -- type definitions
      | exception <exception-decls>      -- exception definitions
      | module <longident> = <longident> -- alias a module
      | open <longident>                 -- provide implicit access to the given module path
      | do <expr>                        -- execute the given expression

    impl-file-decls := <impl-file-decl> ... <impl-file-decl>

    implementation-file :=
      | module <longident> <impl-file-decls> -- named module
      | <impl-file-decls>                -- anonymous module (name implicit from filename)

Custom Attributes.

.NET custom metadata attributes can be added at several positions in the above grammar; these positions are shown separately below for clarity. The attributes are added to the corresponding compiled forms of the given constructs. Compiled forms are only defined for publicly accessible constructs such as publicly available top-level methods, so attributes placed on internal constructs may or may not appear in the compiled binary.

    val-decl := 
      | ...
      | <attributes> <val-decl>        -- attributes on methods and static field values

    field-decl :=
      | ...
      | <attributes> <field-decl>      -- attributes on field declarations

    type-decl :=
      | ...
      | <attributes> <type-decl>       -- attributes on type definitions

    impl-file-decl :=
      | ...
      | <attributes>                   -- attributes on assembly, module or main method

    attribute := <object-construction>
    attributes := [< <attribute> ; ...  ; <attribute> >]