Bug in averaging #1

Unless the `--no-averaging` flag is specified, the learner will attempt to train with the averaged perceptron. However, [the averaging formula](https://github.com/nschneid/arabic-tagger/blob/master/src/edu/cmu/ark/DiscriminativeTagger.java#L785) is incorrect, because averages are computed only for the weights that were updated. (Averaging the entire weight vector at every step would be too slow.)

A correct approach can be found on p. 19 of [Hal Daumé's thesis](http://hal3.name/docs/daume06thesis.pdf). It keeps two vectors: the current weights, and the weights' deviation from the sum over all learning timesteps. Both vectors are updated sparsely, and together they allow the averaged vector to be computed at the end of learning.
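
For illustration, here is a minimal sketch of the two-vector idea. It is not the tagger's code, and it is not copied from the thesis: the class name `SparseAveragedWeights`, its methods, and the counter convention (the counter starts at 0 and counts completed timesteps) are all assumptions made for this example. Under that convention, the result equals the plain average of the weight vector as it stood after each timestep.

```java
/**
 * Hypothetical sketch of sparse perceptron averaging with two vectors.
 * All names here are illustrative and do not come from DiscriminativeTagger.java.
 */
class SparseAveragedWeights {
    private final double[] weights;   // current weights w
    private final double[] deviation; // u: each update scaled by the step count at which it happened
    private long steps = 0;           // number of completed timesteps

    SparseAveragedWeights(int dimension) {
        weights = new double[dimension];
        deviation = new double[dimension];
    }

    /** Applies a sparse update; only the listed indices are touched in either vector. */
    void update(int[] indices, double[] deltas) {
        for (int k = 0; k < indices.length; k++) {
            int i = indices[k];
            weights[i] += deltas[k];
            // An update made after `steps` completed timesteps is missing from
            // exactly `steps` of the per-timestep weight vectors.
            deviation[i] += steps * deltas[k];
        }
    }

    /** Must be called once per training example, whether or not an update occurred. */
    void endStep() {
        steps++;
    }

    /** Returns the average of the weight vectors observed after each timestep. */
    double[] average() {
        double[] avg = new double[weights.length];
        for (int i = 0; i < weights.length; i++) {
            // The single dense pass happens only here, at the end of learning.
            avg[i] = (steps == 0) ? weights[i] : weights[i] - deviation[i] / steps;
        }
        return avg;
    }
}
```

Each call to `update` costs time proportional to the number of changed features, and the only pass over the full vector is the one in `average()`. The key requirement is that `endStep()` runs for every example, including those that trigger no update, so the counter reflects all timesteps.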

