API Reference · Sentencize.jl

Sentencize.SUPPORTED_LANG
Sentencize.Prefixes
Sentencize.split_sentence

Sentencize.SUPPORTED_LANG — Constant

SUPPORTED_LANG

Language codes of the supported languages, usable as the lang argument.

Contains:

"ca"
"cs"
"da"
"de"
"el"
"en"
"es"
"fi"
"fr"
"hu"
"is"
"it"
"lt"
"lv"
"nl"
"no"
"pl"
"pt"
"ro"
"ru"
"sk"
"sl"
"sv"
"tr"
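
A minimal sketch of validating a language code before using it, assuming SUPPORTED_LANG is an iterable collection of strings as the list above suggests. The helper pick_lang is hypothetical and not part of Sentencize.jl.

```julia
using Sentencize

# Hypothetical helper (not part of Sentencize.jl): return the requested code
# if it appears in SUPPORTED_LANG, otherwise fall back to English.
# Assumes SUPPORTED_LANG is an iterable collection of strings.
pick_lang(code::AbstractString) =
    code in Sentencize.SUPPORTED_LANG ? String(code) : "en"

lang_de = pick_lang("de")   # "de" appears in the list above
lang_xx = pick_lang("xx")   # "xx" does not appear in the list above
```

The result can then be passed on as the lang keyword argument.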


Sentencize.Prefixes — Type

Prefixes(prefixes::Dict{T,PrefixType}=Dict{String,PrefixType}(); prefix_file::Union{String,Nothing}=nothing, lang::Union{String,Nothing}="en") where {T<:AbstractString}

Constructs a Prefixes object holding non-breaking prefixes.

Arguments

prefixes::Dict{<:AbstractString,PrefixType}=Dict{String,PrefixType}(): Optional. A dictionary of non-breaking prefixes.
prefix_file::Union{String,Nothing}=nothing: Optional. A path to a file containing non-breaking prefixes to add to the provided prefixes.
lang::Union{String,Nothing}="en": Optional. The language of the non-breaking prefixes (see ?SUPPORTED_LANG for available languages) to be added to prefixes.
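
A short sketch of calling the constructor with the keyword arguments documented above. It relies only on the signature shown here. The name is qualified with the module in case Prefixes is not exported, which is an assumption.

```julia
using Sentencize

# Default call: per the signature above, lang defaults to "en".
en_prefixes = Sentencize.Prefixes()

# Select another language by its code; "de" is listed in SUPPORTED_LANG.
de_prefixes = Sentencize.Prefixes(lang="de")
```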


Sentencize.split_sentence — Method

split_sentence(text::AbstractString; prefixes::Dict{<:AbstractString,PrefixType}=Dict{String,PrefixType}(), prefix_file::Union{String,Nothing}=nothing, lang::Union{String,Nothing}="en")

Splits a text into sentences.

Arguments

text::AbstractString: The text to split into sentences.
prefixes::Dict{<:AbstractString,PrefixType}: Optional. A dictionary of non-breaking prefixes.
prefix_file::Union{String,Nothing}: Optional. A path to a file containing non-breaking prefixes to add to the provided prefixes.
lang::Union{String,Nothing}: Optional. The language of the non-breaking prefixes (see ?SUPPORTED_LANG for available languages) to be added to prefixes. Default is "en" (=English).

Examples

split_sentence("This is a paragraph. It contains several sentences. \"But why,\" you ask?")
# Output: ["This is a paragraph.", "It contains several sentences.", "\"But why,\" you ask?"]
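
As a further sketch, the lang keyword can select a language other than the default. The German sample text is illustrative only, and no output is shown because the exact split depends on the prefixes shipped for that language.

```julia
using Sentencize

# Illustrative German input; "de" is one of the codes in SUPPORTED_LANG.
text = "Das ist ein Absatz. Er enthält mehrere Sätze."

# Qualified call, so it works whether or not split_sentence is exported.
sentences = Sentencize.split_sentence(text; lang="de")

# Print each returned sentence on its own numbered line.
for (i, s) in enumerate(sentences)
    println(i, ": ", s)
end
```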
