367 lines
13 KiB
Markdown
367 lines
13 KiB
Markdown
# stringr 1.5.1
|
||
|
||
* Some minor documentation improvements.
|
||
|
||
* `str_trunc()` now correctly truncates strings when `side` is `"left"` or
|
||
`"center"` (@UchidaMizuki, #512).
|
||
|
||
# stringr 1.5.0
|
||
|
||
## Breaking changes
|
||
|
||
* stringr functions now consistently implement the tidyverse recycling rules
|
||
(#372). There are two main changes:
|
||
|
||
* Only vectors of length 1 are recycled. Previously, (e.g.)
|
||
`str_detect(letters, c("x", "y"))` worked, but it now errors.
|
||
|
||
* `str_c()` ignores `NULLs`, rather than treating them as length 0
|
||
vectors.
|
||
|
||
Additionally, many more arguments now throw errors, rather than warnings,
|
||
if supplied the wrong type of input.
|
||
|
||
* `regex()` and friends now generate class names with `stringr_` prefix (#384).
|
||
|
||
* `str_detect()`, `str_starts()`, `str_ends()` and `str_subset()` now error
|
||
when used with either an empty string (`""`) or a `boundary()`. These
|
||
operations didn't really make sense (`str_detect(x, "")` returned `TRUE`
|
||
for all non-empty strings) and made it easy to make mistakes when programming.
|
||
|
||
## New features
|
||
|
||
* Many tweaks to the documentation to make it more useful and consistent.
|
||
|
||
* New `vignette("from-base")` by @sastoudt provides a comprehensive comparison
|
||
between base R functions and their stringr equivalents. It's designed to
|
||
help you move to stringr if you're already familiar with base R string
|
||
functions (#266).
|
||
|
||
* New `str_escape()` escapes regular expression metacharacters, providing
|
||
an alternative to `fixed()` if you want to compose a pattern from user
|
||
supplied strings (#408).
|
||
|
||
* New `str_equal()` compares two character vectors using unicode rules,
|
||
optionally ignoring case (#381).
|
||
|
||
* `str_extract()` can now optionally extract a capturing group instead of
|
||
the complete match (#420).
|
||
|
||
* New `str_flatten_comma()` is a special case of `str_flatten()` designed for
|
||
comma separated flattening and can correctly apply the Oxford commas
|
||
when there are only two elements (#444).
|
||
|
||
* New `str_split_1()` is tailored for the special case of splitting up a single
|
||
string (#409).
|
||
|
||
* New `str_split_i()` extract a single piece from a string (#278, @bfgray3).
|
||
|
||
* New `str_like()` allows the use of SQL wildcards (#280, @rjpat).
|
||
|
||
* New `str_rank()` to complete the set of order/rank/sort functions (#353).
|
||
|
||
* New `str_sub_all()` to extract multiple substrings from each string.
|
||
|
||
* New `str_unique()` is a wrapper around `stri_unique()` and returns unique
|
||
string values in a character vector (#249, @seasmith).
|
||
|
||
* `str_view()` uses ANSI colouring rather than an HTML widget (#370). This
|
||
works in more places and requires fewer dependencies. It includes a number
|
||
of other small improvements:
|
||
|
||
* It no longer requires a pattern so you can use it to display strings with
|
||
special characters.
|
||
* It highlights unusual whitespace characters.
|
||
* It's vectorised over both string` and `pattern` (#407).
|
||
* It defaults to displaying all matches, making `str_view_all()` redundant
|
||
(and hence deprecated) (#455).
|
||
|
||
* New `str_width()` returns the display width of a string (#380).
|
||
|
||
* stringr is now licensed as MIT (#351).
|
||
|
||
## Minor improvements and bug fixes
|
||
|
||
* Better error message if you supply a non-string pattern (#378).
|
||
|
||
* A new data source for `sentences` has fixed many small errors.
|
||
|
||
* `str_extract()` and `str_exctract_all()` now work correctly when `pattern`
|
||
is a `boundary()`.
|
||
|
||
* `str_flatten()` gains a `last` argument that optionally override the
|
||
final separator (#377). It gains a `na.rm` argument to remove missing
|
||
values (since it's a summary function) (#439).
|
||
|
||
* `str_pad()` gains `use_width` argument to control whether to use the total
|
||
code point width or the number of code points as "width" of a string (#190).
|
||
|
||
* `str_replace()` and `str_replace_all()` can use standard tidyverse formula
|
||
shorthand for `replacement` function (#331).
|
||
|
||
* `str_starts()` and `str_ends()` now correctly respect regex operator
|
||
precedence (@carlganz).
|
||
|
||
* `str_wrap()` breaks only at whitespace by default; set
|
||
`whitespace_only = FALSE` to return to the previous behaviour (#335, @rjpat).
|
||
|
||
* `word()` now returns all the sentence when using a negative `start` parameter
|
||
that is greater or equal than the number of words. (@pdelboca, #245)
|
||
|
||
# stringr 1.4.1
|
||
|
||
Hot patch release to resolve R CMD check failures.
|
||
|
||
# stringr 1.4.0
|
||
|
||
* `str_interp()` now renders lists consistently independent on the presence of
|
||
additional placeholders (@amhrasmussen).
|
||
|
||
* New `str_starts()` and `str_ends()` functions to detect patterns at the
|
||
beginning or end of strings (@jonthegeek, #258).
|
||
|
||
* `str_subset()`, `str_detect()`, and `str_which()` get `negate` argument,
|
||
which is useful when you want the elements that do NOT match (#259,
|
||
@yutannihilation).
|
||
|
||
* New `str_to_sentence()` function to capitalize with sentence case
|
||
(@jonthegeek, #202).
|
||
|
||
# stringr 1.3.1
|
||
|
||
* `str_replace_all()` with a named vector now respects modifier functions (#207)
|
||
|
||
* `str_trunc()` is once again vectorised correctly (#203, @austin3dickey).
|
||
|
||
* `str_view()` handles `NA` values more gracefully (#217). I've also
|
||
tweaked the sizing policy so hopefully it should work better in notebooks,
|
||
while preserving the existing behaviour in knit documents (#232).
|
||
|
||
# stringr 1.3.0
|
||
|
||
## API changes
|
||
|
||
* During package build, you may see
|
||
`Error : object ‘ignore.case’ is not exported by 'namespace:stringr'`.
|
||
This is because the long deprecated `str_join()`, `ignore.case()` and
|
||
`perl()` have now been removed.
|
||
|
||
## New features
|
||
|
||
* `str_glue()` and `str_glue_data()` provide convenient wrappers around
|
||
`glue` and `glue_data()` from the [glue](https://glue.tidyverse.org/) package
|
||
(#157).
|
||
|
||
* `str_flatten()` is a wrapper around `stri_flatten()` and clearly
|
||
conveys flattening a character vector into a single string (#186).
|
||
|
||
* `str_remove()` and `str_remove_all()` functions. These wrap
|
||
`str_replace()` and `str_replace_all()` to remove patterns from strings.
|
||
(@Shians, #178)
|
||
|
||
* `str_squish()` removes spaces from both the left and right side of strings,
|
||
and also converts multiple space (or space-like characters) to a single
|
||
space within strings (@stephlocke, #197).
|
||
|
||
* `str_sub()` gains `omit_na` argument for ignoring `NA`. Accordingly,
|
||
`str_replace()` now ignores `NA`s and keeps the original strings.
|
||
(@yutannihilation, #164)
|
||
|
||
## Bug fixes and minor improvements
|
||
|
||
* `str_trunc()` now preserves NAs (@ClaytonJY, #162)
|
||
|
||
* `str_trunc()` now throws an error when `width` is shorter than `ellipsis`
|
||
(@ClaytonJY, #163).
|
||
|
||
* Long deprecated `str_join()`, `ignore.case()` and `perl()` have now been
|
||
removed.
|
||
|
||
# stringr 1.2.0
|
||
|
||
## API changes
|
||
|
||
* `str_match_all()` now returns NA if an optional group doesn't match
|
||
(previously it returned ""). This is more consistent with `str_match()`
|
||
and other match failures (#134).
|
||
|
||
## New features
|
||
|
||
* In `str_replace()`, `replacement` can now be a function that is called once
|
||
for each match and whose return value is used to replace the match.
|
||
|
||
* New `str_which()` mimics `grep()` (#129).
|
||
|
||
* A new vignette (`vignette("regular-expressions")`) describes the
|
||
details of the regular expressions supported by stringr.
|
||
The main vignette (`vignette("stringr")`) has been updated to
|
||
give a high-level overview of the package.
|
||
|
||
## Minor improvements and bug fixes
|
||
|
||
* `str_order()` and `str_sort()` gain explicit `numeric` argument for sorting
|
||
mixed numbers and strings.
|
||
|
||
* `str_replace_all()` now throws an error if `replacement` is not a character
|
||
vector. If `replacement` is `NA_character_` it replaces the complete string
|
||
with replaces with `NA` (#124).
|
||
|
||
* All functions that take a locale (e.g. `str_to_lower()` and `str_sort()`)
|
||
default to "en" (English) to ensure that the default is consistent across
|
||
platforms.
|
||
|
||
# stringr 1.1.0
|
||
|
||
* Add sample datasets: `fruit`, `words` and `sentences`.
|
||
|
||
* `fixed()`, `regex()`, and `coll()` now throw an error if you use them with
|
||
anything other than a plain string (#60). I've clarified that the replacement
|
||
for `perl()` is `regex()` not `regexp()` (#61). `boundary()` has improved
|
||
defaults when splitting on non-word boundaries (#58, @lmullen).
|
||
|
||
* `str_detect()` now can detect boundaries (by checking for a `str_count()` > 0)
|
||
(#120). `str_subset()` works similarly.
|
||
|
||
* `str_extract()` and `str_extract_all()` now work with `boundary()`. This is
|
||
particularly useful if you want to extract logical constructs like words
|
||
or sentences. `str_extract_all()` respects the `simplify` argument
|
||
when used with `fixed()` matches.
|
||
|
||
* `str_subset()` now respects custom options for `fixed()` patterns
|
||
(#79, @gagolews).
|
||
|
||
* `str_replace()` and `str_replace_all()` now behave correctly when a
|
||
replacement string contains `$`s, `\\\\1`, etc. (#83, #99).
|
||
|
||
* `str_split()` gains a `simplify` argument to match `str_extract_all()`
|
||
etc.
|
||
|
||
* `str_view()` and `str_view_all()` create HTML widgets that display regular
|
||
expression matches (#96).
|
||
|
||
* `word()` returns `NA` for indexes greater than number of words (#112).
|
||
|
||
# stringr 1.0.0
|
||
|
||
* stringr is now powered by [stringi](https://github.com/gagolews/stringi)
|
||
instead of base R regular expressions. This improves unicode and support, and
|
||
makes most operations considerably faster. If you find stringr inadequate for
|
||
your string processing needs, I highly recommend looking at stringi in more
|
||
detail.
|
||
|
||
* stringr gains a vignette, currently a straight forward update of the article
|
||
that appeared in the R Journal.
|
||
|
||
* `str_c()` now returns a zero length vector if any of its inputs are
|
||
zero length vectors. This is consistent with all other functions, and
|
||
standard R recycling rules. Similarly, using `str_c("x", NA)` now
|
||
yields `NA`. If you want `"xNA"`, use `str_replace_na()` on the inputs.
|
||
|
||
* `str_replace_all()` gains a convenient syntax for applying multiple pairs of
|
||
pattern and replacement to the same vector:
|
||
|
||
```R
|
||
input <- c("abc", "def")
|
||
str_replace_all(input, c("[ad]" = "!", "[cf]" = "?"))
|
||
```
|
||
|
||
* `str_match()` now returns NA if an optional group doesn't match
|
||
(previously it returned ""). This is more consistent with `str_extract()`
|
||
and other match failures.
|
||
|
||
* New `str_subset()` keeps values that match a pattern. It's a convenient
|
||
wrapper for `x[str_detect(x)]` (#21, @jiho).
|
||
|
||
* New `str_order()` and `str_sort()` allow you to sort and order strings
|
||
in a specified locale.
|
||
|
||
* New `str_conv()` to convert strings from specified encoding to UTF-8.
|
||
|
||
* New modifier `boundary()` allows you to count, locate and split by
|
||
character, word, line and sentence boundaries.
|
||
|
||
* The documentation got a lot of love, and very similar functions (e.g.
|
||
first and all variants) are now documented together. This should hopefully
|
||
make it easier to locate the function you need.
|
||
|
||
* `ignore.case(x)` has been deprecated in favour of
|
||
`fixed|regex|coll(x, ignore.case = TRUE)`, `perl(x)` has been deprecated in
|
||
favour of `regex(x)`.
|
||
|
||
* `str_join()` is deprecated, please use `str_c()` instead.
|
||
|
||
# stringr 0.6.2
|
||
|
||
* fixed path in `str_wrap` example so works for more R installations.
|
||
|
||
* remove dependency on plyr
|
||
|
||
# stringr 0.6.1
|
||
|
||
* Zero input to `str_split_fixed` returns 0 row matrix with `n` columns
|
||
|
||
* Export `str_join`
|
||
|
||
# stringr 0.6
|
||
|
||
* new modifier `perl` that switches to Perl regular expressions
|
||
|
||
* `str_match` now uses new base function `regmatches` to extract matches -
|
||
this should hopefully be faster than my previous pure R algorithm
|
||
|
||
# stringr 0.5
|
||
|
||
* new `str_wrap` function which gives `strwrap` output in a more convenient
|
||
format
|
||
|
||
* new `word` function extract words from a string given user defined
|
||
separator (thanks to suggestion by David Cooper)
|
||
|
||
* `str_locate` now returns consistent type when matching empty string (thanks
|
||
to Stavros Macrakis)
|
||
|
||
* new `str_count` counts number of matches in a string.
|
||
|
||
* `str_pad` and `str_trim` receive performance tweaks - for large vectors this
|
||
should give at least a two order of magnitude speed up
|
||
|
||
* str_length returns NA for invalid multibyte strings
|
||
|
||
* fix small bug in internal `recyclable` function
|
||
|
||
# stringr 0.4
|
||
|
||
* all functions now vectorised with respect to string, pattern (and
|
||
where appropriate) replacement parameters
|
||
* fixed() function now tells stringr functions to use fixed matching, rather
|
||
than escaping the regular expression. Should improve performance for
|
||
large vectors.
|
||
* new ignore.case() modifier tells stringr functions to ignore case of
|
||
pattern.
|
||
* str_replace renamed to str_replace_all and new str_replace function added.
|
||
This makes str_replace consistent with all functions.
|
||
* new str_sub<- function (analogous to substring<-) for substring replacement
|
||
* str_sub now understands negative positions as a position from the end of
|
||
the string. -1 replaces Inf as indicator for string end.
|
||
* str_pad side argument can be left, right, or both (instead of center)
|
||
* str_trim gains side argument to better match str_pad
|
||
* stringr now has a namespace and imports plyr (rather than requiring it)
|
||
|
||
# stringr 0.3
|
||
|
||
* fixed() now also escapes |
|
||
* str_join() renamed to str_c()
|
||
* all functions more carefully check input and return informative error
|
||
messages if not as expected.
|
||
* add invert_match() function to convert a matrix of location of matches to
|
||
locations of non-matches
|
||
* add fixed() function to allow matching of fixed strings.
|
||
|
||
# stringr 0.2
|
||
|
||
* str_length now returns correct results when used with factors
|
||
* str_sub now correctly replaces Inf in end argument with length of string
|
||
* new function str_split_fixed returns fixed number of splits in a character
|
||
matrix
|
||
* str_split no longer uses strsplit to preserve trailing breaks
|