RNN_StackOverFlow#

class fl_sim.models.RNN_StackOverFlow(vocab_size: int = 10000, num_oov_buckets: int = 1, embedding_size: int = 96, latent_size: int = 670, num_layers: int = 1)[source]#

Bases: Module, CLFMixin, SizeMixin, DiffMixin

Creates an RNN model using LSTM layers for StackOverFlow (next-word prediction task).

This replicates the model structure described in Reddi et al. [1].

Modified from FedML.

Parameters:
  • vocab_size (int, default 10000) – The number of different words that can appear in the input.

  • num_oov_buckets (int, default 1) – The number of out-of-vocabulary buckets.

  • embedding_size (int, default 96) – The size of each embedding vector.

  • latent_size (int, default 670) – The number of features in the hidden state h.

  • num_layers (int, default 1) – The number of recurrent layers (torch.nn.LSTM).
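The documented structure (token embedding, stacked LSTM, linear projection over the extended vocabulary) can be sketched with plain `torch.nn` modules. This is a hedged, simplified re-implementation, not the library's actual code; in particular, the way the extended vocabulary size is computed here (adding pad/bos/eos tokens plus the OOV buckets, as in the Reddi et al. StackOverFlow setup) is an assumption, and `TinyNextWordRNN` is a hypothetical name:

```python
import torch
import torch.nn as nn

class TinyNextWordRNN(nn.Module):
    """Sketch mirroring the documented structure:
    embedding -> LSTM -> linear over the extended vocabulary."""

    def __init__(self, vocab_size=10000, num_oov_buckets=1,
                 embedding_size=96, latent_size=670, num_layers=1):
        super().__init__()
        # Assumption: pad/bos/eos special tokens plus the OOV buckets
        # extend the base vocabulary, as in the Reddi et al. setup.
        extended_vocab = vocab_size + num_oov_buckets + 3
        self.embedding = nn.Embedding(extended_vocab, embedding_size)
        self.lstm = nn.LSTM(embedding_size, latent_size,
                            num_layers=num_layers, batch_first=True)
        self.fc = nn.Linear(latent_size, extended_vocab)

    def forward(self, input_seq, hidden_state=None):
        emb = self.embedding(input_seq)        # (batch, seq, embedding)
        out, _ = self.lstm(emb, hidden_state)  # (batch, seq, latent)
        logits = self.fc(out)                  # (batch, seq, vocab)
        # Match the documented return layout (batch, vocab, seq).
        return logits.transpose(1, 2)

# Tiny configuration so the sketch runs quickly.
model = TinyNextWordRNN(vocab_size=50, embedding_size=8,
                        latent_size=16, num_layers=1)
x = torch.randint(0, 50, (2, 5))  # (batch_size=2, seq_len=5)
out = model(x)
print(tuple(out.shape))  # (2, 54, 5): 54 = 50 vocab + 1 OOV + 3 special
```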

References

forward(input_seq: Tensor, hidden_state: Tensor | None = None) → Tensor[source]#

Forward pass.

Parameters:
  • input_seq (torch.Tensor) – Shape (batch_size, seq_len), dtype torch.long.

  • hidden_state (torch.Tensor, optional) – Shape (num_layers, batch_size, latent_size), dtype torch.float32.

Returns:

Shape (batch_size, extended_vocab_size, seq_len), dtype torch.float32.

Return type:

torch.Tensor
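Note that the return layout puts the class (vocabulary) dimension second. Presumably this is so the output can be fed directly to `torch.nn.CrossEntropyLoss`, which expects `(batch, classes, ...)`; a minimal sketch with stand-in tensors (the shapes are illustrative, not taken from the library):

```python
import torch
import torch.nn as nn

batch_size, seq_len, extended_vocab_size = 2, 5, 54

# Stand-in for the model output: (batch_size, extended_vocab_size, seq_len).
logits = torch.randn(batch_size, extended_vocab_size, seq_len)

# Next-word targets, one token id per position: (batch_size, seq_len).
targets = torch.randint(0, extended_vocab_size, (batch_size, seq_len))

# CrossEntropyLoss expects the class dimension second, which matches
# the documented return layout -- no transpose needed.
loss = nn.CrossEntropyLoss()(logits, targets)
print(loss.shape)  # torch.Size([]) -- a scalar
```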

pipeline(truncated_sentence: str, word_to_id: Callable[[str], int] | Dict[str, int] | None = None, id_to_word: Callable[[int], str] | Dict[int, str] | None = None) → str[source]#

Predict the next word given a truncated sentence.

Parameters:
  • truncated_sentence (str) – The truncated sentence.

  • word_to_id (Callable[[str], int] or Dict[str, int], optional) – A callable or dictionary that maps a word to its id.

  • id_to_word (Callable[[int], str] or Dict[int, str], optional) – A callable or dictionary that maps an id to its word.

Returns:

The predicted next word.

Return type:

str
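Since both vocabulary arguments accept either a callable or a dict, one plausible way such an interface normalizes them is to wrap dicts as lookup functions. The helper below (`as_lookup` is a hypothetical name, not part of the library) sketches that idea:

```python
def as_lookup(mapping):
    """Normalize a dict or callable into a callable lookup.

    A hypothetical helper illustrating how an interface like
    pipeline() could accept both forms uniformly.
    """
    if mapping is None:
        raise ValueError("a vocabulary mapping is required")
    if isinstance(mapping, dict):
        return mapping.__getitem__
    return mapping  # already a callable

# Both forms behave identically after normalization.
word_to_id = as_lookup({"hello": 0, "world": 1})
id_to_word = as_lookup(lambda i: ["hello", "world"][i])
print(word_to_id("world"), id_to_word(0))  # 1 hello
```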