Gedit regular expressions plugin

From WickyWiki

Latest revision as of 18:52, 11 July 2019

Install

NOTE: gedit has had native regular expression search since version 3.18. Select Find and Replace from the menu and enable regular expressions there.

  1. Download the plugin version that matches your gedit release from https://bitbucket.org/brandizzi/gedit-re-search/wiki/Home
  2. Extract the contents to the gedit plugin directory, typically:
    • ~/.gnome2/gedit/plugins (gedit2)
    • ~/.local/share/gedit/plugins (gedit3)
  3. Restart gedit.
  4. In the menu: Edit -> Preferences -> Plugins -> enable 'RegEx Search and Replace'.
  5. You should now have a 'Regular Expression..' item in the Search menu.

Other gedit plugins

    • http://live.gnome.org/Gedit/Plugins

Regular expressions overview

Expression     Matches
\t             tab
\r             carriage return (CR)
\n             newline (LF)
.              any single character
[1234abcd]     any one of the listed characters
[^1234abcd]    any one character that is not listed
[0-9a-zA-Z]    any one character within the listed ranges
expr*          'expr' repeated zero or more times
expr+          'expr' repeated one or more times
expr{n,m}      'expr' repeated n to m times
(expr)         capture 'expr'; reuse the matched text in the replacement as \1, \2, \3, etc.
^              start of line
$              end of line
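
The constructs in the table can also be tried outside gedit. The following is a minimal sketch using Python's re module, which is assumed here to behave like the plugin's engine for these basic constructs. The sample text and variable names are made up for illustration.

```python
import re

# Made-up sample: a tab-separated first line and a second line.
text = "id:\t42\nname: gedit"

# \t matches the tab; [0-9]+ matches one or more digits.
digits = re.findall(r"\t([0-9]+)", text)

# In Python, ^ anchors at every line start only with the MULTILINE flag.
first_words = re.findall(r"^[a-z]+", text, flags=re.MULTILINE)

# (expr) captures text that the replacement reuses as \1 and \2.
swapped = re.sub(r"([a-z]+): ([a-z]+)", r"\2: \1", text)

print(digits)
print(first_words)
print(repr(swapped))
```

Only the second line contains "word: word", so only that line is swapped.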

Examples

Replace \r\n with \n

To enforce Unix style end-of-line (EOL).

Find-Replace:

Search     Replace with
\r\n       \n
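
The same replacement can be sketched in Python on an in-memory string. The function name to_unix_eol is hypothetical and not part of the plugin.

```python
import re

def to_unix_eol(text: str) -> str:
    """Replace every CR LF pair with a single LF."""
    return re.sub(r"\r\n", "\n", text)

converted = to_unix_eol("line one\r\nline two\r\n")
print(repr(converted))
```

A lone \r that is not followed by \n is left untouched, exactly as with the search pattern above.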

Remove trailing whitespace

To tidy the document further by stripping spaces and tabs at the end of each line.

Search             Replace with
[ \t]{1,99}\n      \n
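
As a rough check of what this pattern does and does not touch, here is a small Python sketch. The function name and sample text are assumptions for illustration.

```python
import re

def strip_trailing_whitespace(text: str) -> str:
    """Remove runs of 1 to 99 spaces or tabs that sit directly before a newline."""
    return re.sub(r"[ \t]{1,99}\n", "\n", text)

sample = "alpha  \nbeta\t\ngamma "
cleaned = strip_trailing_whitespace(sample)
print(repr(cleaned))
```

Because the pattern requires a following \n, a last line without a final newline keeps its trailing space.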

Remove EOL 1: trailing and leading non-capital letter

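To undo hard line wrapping (fixed line length) while keeping paragraph breaks. A line is joined to the next one only if it ends in a lowercase letter, comma, semicolon or colon and the next line starts with a lowercase letter.

Search                         Replace with
([a-z,;:])\n{1,9}([a-z])       \1 \2

A broader variant joins after any character except the listed punctuation, across a single newline, when the next line starts with a lowercase letter or a double quote.

Search                         Replace with
([^"'.,>!?=])\n([a-z"])        \1 \2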
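
The two searches can be compared side by side in a short Python sketch. The names STRICT, BROAD and join_lines, and the sample texts, are hypothetical.

```python
import re

# First search: lowercase or , ; : before the break, lowercase after it.
STRICT = re.compile(r"([a-z,;:])\n{1,9}([a-z])")
# Second search: anything but the listed punctuation before a single break.
BROAD = re.compile(r"""([^"'.,>!?=])\n([a-z"])""")

def join_lines(text: str, pattern: re.Pattern) -> str:
    """Replace the matched line break with one space, keeping both neighbours."""
    return pattern.sub(r"\1 \2", text)

wrapped = "the quick,\nbrown fox\njumps.\n\nNew paragraph"
quoted = 'he said\n"hello" there.\nNext'

print(repr(join_lines(wrapped, STRICT)))
print(repr(join_lines(quoted, BROAD)))
```

In the first sample the blank line after "jumps." survives because a full stop is not in the first character class, so the paragraph break is kept.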

Remove EOL 2: leading non-capital letter

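To undo hard line wrapping (fixed line length) while keeping paragraph breaks. Only the start of the next line is checked: it must begin with a lowercase letter.

Note: an underscore (_) is used here to signify a space.

Search              Replace with
\n{1,9}([a-z])      _\1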

Remove EOL 3: trailing non-capital letter

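To undo hard line wrapping (fixed line length) while keeping paragraph breaks. Only the end of the current line is checked: it must end in a lowercase letter, comma, semicolon or colon. As above, an underscore (_) signifies a space.

Search                  Replace with
([a-z,;:])\n{1,9}       \1_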
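
Variants 2 and 3 each test only one side of the line break. A minimal Python sketch of the difference follows, with the underscore written as a real space. The function names and sample text are hypothetical.

```python
import re

def join_before_lowercase(text: str) -> str:
    """Variant 2: join when the next line starts with a lowercase letter."""
    return re.sub(r"\n{1,9}([a-z])", r" \1", text)

def join_after_lowercase(text: str) -> str:
    """Variant 3: join when the current line ends in a lowercase letter or , ; :"""
    return re.sub(r"([a-z,;:])\n{1,9}", r"\1 ", text)

sample = "see\nParis and\nrome."
print(repr(join_before_lowercase(sample)))
print(repr(join_after_lowercase(sample)))
```

Variant 2 leaves the break before "Paris" in place because of the capital letter, while variant 3 removes both breaks.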

Remove hyphenation '-' from words

To rejoin hyphenated words that are left over after removing fixed line length.

Search                      Replace with
([a-z])-\n{1,9}([a-z])      \1\2
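
A small Python sketch of this replacement, assuming a made-up sample text and a hypothetical function name:

```python
import re

def dehyphenate(text: str) -> str:
    """Drop a hyphen plus the line break(s) after it when both sides are lowercase."""
    return re.sub(r"([a-z])-\n{1,9}([a-z])", r"\1\2", text)

sample = "regu-\nlar expres-\n\nsion, well-known"
print(dehyphenate(sample))
```

A hyphen inside a line, as in "well-known", is not followed by a newline and is therefore kept.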

Split word with capital letter in the middle

To correct an OCR problem: a missing space between two words.

Search                Replace with
([a-z,.])([A-Z])      \1 \2
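
The effect, including a side effect on names that are meant to contain an inner capital, can be sketched in Python. The function name and sample strings are assumptions.

```python
import re

def split_at_inner_capital(text: str) -> str:
    """Insert a space between a lowercase letter, comma or full stop and a capital."""
    return re.sub(r"([a-z,.])([A-Z])", r"\1 \2", text)

print(split_at_inner_capital("helloWorld.Next stepHere"))
print(split_at_inner_capital("WickyWiki"))
```

The second call shows that intentional spellings such as "WickyWiki" are split too, so it may be safer to review matches one by one than to replace all.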

Replace 1 (one) in a non-number with I

To correct an OCR problem: the letter I read as the digit 1. Only a 1 with a non-digit on both sides is replaced.

Search                 Replace with
([^0-9])1([^0-9])      \1I\2
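
A short Python sketch shows which occurrences of 1 are affected. The function name and sample sentence are hypothetical.

```python
import re

def one_to_capital_i(text: str) -> str:
    """Replace a 1 that has a non-digit character on both sides with I."""
    return re.sub(r"([^0-9])1([^0-9])", r"\1I\2", text)

sample = "and 1 think 11 or 1999"
print(one_to_capital_i(sample))
```

Digits inside "11" and "1999" are left alone. A genuine standalone number 1 would also be changed, and a 1 at the very start or end of the text has no neighbour on one side and is not matched.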

Search number > 9

To find page numbers: the search matches any run of 2 to 9 digits.

Search
[0-9]{2,9}
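
What this search finds can be checked with a short Python sketch on a made-up sample line:

```python
import re

sample = "page 7 of 12, printed 2019"

# Collect every run of 2 to 9 consecutive digits.
numbers = re.findall(r"[0-9]{2,9}", sample)
print(numbers)
```

The single digit 7 is skipped because the pattern needs at least two digits.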

Regular expressions in LibreOffice

You can also use regular expressions in LibreOffice. Note that references to captured groups in the replacement are written with a '$' (for example $1) instead of a backslash.

Remove paragraph trailing spaces

Search            Replace with
^(.*) {1,9}$      $1
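
The behaviour of this search can be explored with Python's re module as a stand-in, where the group reference is written \1 instead of $1. Whether LibreOffice's own engine backtracks in exactly the same way is an assumption that is not verified here. The lazy variant and all names below are illustrative additions, not part of the original page.

```python
import re

text = "abc   \nxyz "

# Greedy (.*) grabs as many characters as it can, so only the
# last trailing space is left for ' {1,9}' to match.
greedy = re.sub(r"^(.*) {1,9}$", r"\1", text, flags=re.MULTILINE)

# Lazy (.*?) stops as early as possible, leaving up to nine
# trailing spaces for ' {1,9}' to match.
lazy = re.sub(r"^(.*?) {1,9}$", r"\1", text, flags=re.MULTILINE)

print(repr(greedy))
print(repr(lazy))
```

In Python the greedy form removes one space per pass, so a line with several trailing spaces needs repeated runs, while the lazy form clears up to nine in one pass.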