Gedit regular expressions plugin

From WickyWiki

Latest revision as of 18:52, 11 July 2019

Install

NOTE: gedit now has native regular expression search (version >= 3.18). Choose Find and Replace from the menu and enable regular expressions there; the plugin below is only needed for older versions.

  1. Download the correct plugin from https://bitbucket.org/brandizzi/gedit-re-search/wiki/Home
  2. Extract the contents to the gedit plugin directory, typically:
    • ~/.gnome2/gedit/plugins (gedit2)
    • ~/.local/share/gedit/plugins (gedit3)
  3. Restart gedit.
  4. In the menu, go to Edit -> Preferences -> Plugins and enable 'RegEx Search and Replace'.
  5. You should now have a 'Regular Expression..' item in the Search menu.

Other gedit plugins

  • http://live.gnome.org/Gedit/Plugins

Regular expressions overview

Expression      Matches
\t              tab
\r              carriage return (CR)
\n              newline (LF)
.               any character (usually except newline)
[1234abcd]      any one of the specified characters
[^1234abcd]     any one character except the specified ones
[0-9a-zA-Z]     any one character within the specified ranges
expr*           'expr' repeated 0 or more times
expr+           'expr' repeated 1 or more times
expr{n,m}       'expr' repeated n to m times
(expr)          capture 'expr'; refer to the captured text in the replacement with \1 \2 \3 etc.
^               start of line
$               end of line
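
As a rough illustration of how these pieces combine, the sketch below uses Python's re module, on the assumption that its syntax for the expressions in this table is close enough to what gedit accepts. The function name and the sample text are made up for the example.

```python
import re

def swap_key_value(text: str) -> str:
    # ([a-z]+) and ([0-9]+) are capture groups; the replacement
    # reuses them in reverse order as \2 and \1.
    return re.sub(r"([a-z]+)=([0-9]+)", r"\2=\1", text)

print(swap_key_value("width=600"))
```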

Examples

Replace \r\n with \n

To enforce Unix style end-of-line (EOL).

Find-Replace:

Search      Replace with
\r\n        \n
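
The same replacement can be tried outside gedit. This is a minimal Python sketch; the function name is illustrative only.

```python
import re

def to_unix_eol(text: str) -> str:
    # Replace every CR+LF pair with a single LF.
    return re.sub(r"\r\n", "\n", text)

print(repr(to_unix_eol("first line\r\nsecond line\r\n")))
```

Lines that already end in a bare \n are left untouched, so the replacement is safe to run more than once.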

Remove trailing whitespace

To tidy up the document by removing spaces and tabs at the end of each line.

Search              Replace with
[ \t]{1,99}\n       \n
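
A small Python sketch of this search and replace, assuming the text already uses \n line endings. The function name is hypothetical.

```python
import re

def strip_trailing_whitespace(text: str) -> str:
    # 1 to 99 spaces or tabs directly before a newline are dropped.
    return re.sub(r"[ \t]{1,99}\n", "\n", text)

print(repr(strip_trailing_whitespace("foo  \nbar\t\nbaz\n")))
```

Because the pattern requires a newline after the whitespace, trailing spaces on a last line without a final newline are not removed.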

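Remove EOL 1: trailing and leading non-capital letter

To undo fixed-width line wrapping without merging paragraphs. A line is joined to the next one only when it ends in a lowercase letter, comma, semicolon or colon and the next line starts with a lowercase letter.

Search                          Replace with
([a-z,;:])\n{1,9}([a-z])        \1 \2

A broader variant joins any line that does not end in a quote or one of . , > ! ? = when the next line starts with a lowercase letter or a double quote:

Search                          Replace with
([^"'.,>!?=])\n([a-z"])         \1 \2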
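
Both patterns can be compared in a short Python sketch. The function names and sample sentences are invented for the example; the patterns are the ones from the tables above.

```python
import re

# Strict variant: lowercase letter or , ; : before the break, lowercase after.
STRICT = re.compile(r"([a-z,;:])\n{1,9}([a-z])")
# Broader variant: anything except the listed punctuation before the break.
LOOSE = re.compile(r"""([^"'.,>!?=])\n([a-z"])""")

def join_wrapped_lines(text: str) -> str:
    return STRICT.sub(r"\1 \2", text)

def join_wrapped_lines_loose(text: str) -> str:
    return LOOSE.sub(r"\1 \2", text)

sample = "The quick brown\nfox jumps.\n\nNew paragraph\nhere."
print(repr(join_wrapped_lines(sample)))
print(repr(join_wrapped_lines_loose("see Fig 3\nand on")))
```

The strict variant leaves "see Fig 3\nand on" alone because the first line ends in a digit; the broader one joins it.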

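Remove EOL 2: leading non-capital letter

To undo fixed-width line wrapping without merging paragraphs. A line break is removed whenever the next line starts with a lowercase letter, whatever the previous line ends with.

Note: an underscore (_) is used here to signify a space.

Search              Replace with
\n{1,9}([a-z])      _\1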

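Remove EOL 3: trailing non-capital letter

To undo fixed-width line wrapping without merging paragraphs. A line break is removed whenever the line ends in a lowercase letter, comma, semicolon or colon, whatever the next line starts with. As above, an underscore (_) signifies a space.

Search                  Replace with
([a-z,;:])\n{1,9}       \1_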
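
The two one-sided variants (EOL 2 and EOL 3) can be sketched in Python as follows. A real space is written where the tables show an underscore, and the function names are hypothetical.

```python
import re

def join_before_lowercase(text: str) -> str:
    # EOL 2: drop the break when the next line starts with a lowercase letter.
    return re.sub(r"\n{1,9}([a-z])", r" \1", text)

def join_after_lowercase(text: str) -> str:
    # EOL 3: drop the break when the line ends in a lowercase letter or , ; :
    return re.sub(r"([a-z,;:])\n{1,9}", r"\1 ", text)

print(repr(join_before_lowercase("Item 1\ncontinued\nNext")))
print(repr(join_after_lowercase("wrapped,\nLine two.\nEnd")))
```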

Remove hyphenation '-' from words

To rejoin words that were hyphenated across a line break. The patterns above do not join lines that end in a hyphen, so run this as a separate step.

Search                          Replace with
([a-z])-\n{1,9}([a-z])          \1\2
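
A minimal Python sketch of the dehyphenation step; the function name is illustrative. Hyphens inside a line are kept because the pattern requires a line break directly after the hyphen.

```python
import re

def dehyphenate(text: str) -> str:
    # lowercase letter, hyphen, 1 to 9 newlines, lowercase letter
    return re.sub(r"([a-z])-\n{1,9}([a-z])", r"\1\2", text)

print(repr(dehyphenate("a well-known hyphen-\nation")))
```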

Split word with capital letter in the middle

To correct an OCR problem: a missing space between a word (or a comma or period) and the next capitalized word.

Search                  Replace with
([a-z,.])([A-Z])        \1 \2
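
A short Python sketch of this fix, with an invented function name and sample text. Note that the pattern cannot tell OCR errors from intentional mixed case, so a name such as "McDonald" would also be split.

```python
import re

def split_run_together(text: str) -> str:
    # Insert a space between a lowercase letter, comma or period
    # and a directly following capital letter.
    return re.sub(r"([a-z,.])([A-Z])", r"\1 \2", text)

print(split_run_together("helloWorld.Next"))
```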

Replace 1 (one) in a non-number with I

To correct an OCR problem: the letter I misread as the digit 1. Only a 1 that has no digit on either side is replaced.

Search                      Replace with
([^0-9])1([^0-9])           \1I\2
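
The replacement can be checked with a small Python sketch; the function name and sentence are made up. Because the pattern needs one character on each side, a lone 1 at the very start or end of the text is not changed.

```python
import re

def fix_ocr_one(text: str) -> str:
    # A 1 surrounded by non-digits becomes I; the neighbors are kept via \1 and \2.
    return re.sub(r"([^0-9])1([^0-9])", r"\1I\2", text)

print(fix_ocr_one("When 1 was 11, 1 saw 21 birds"))
```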

Search number > 9

To find page numbers: any run of 2 to 9 digits.

Search
[0-9]{2,9}
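
A minimal Python sketch of this search, listing every match instead of stepping through them in the editor. The function name is hypothetical.

```python
import re

def find_multi_digit_numbers(text: str) -> list[str]:
    # Runs of 2 to 9 digits; single digits are skipped.
    return re.findall(r"[0-9]{2,9}", text)

print(find_multi_digit_numbers("page 7, page 12 and page 345"))
```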

Regular expressions in LibreOffice

You can also use regular expressions in LibreOffice. Note that captured groups are referenced in the replacement with a '$' ($1, $2, ...) instead of a '\'.

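Remove paragraph trailing spaces

Search              Replace with
^(.*) {1,9}$        $1

Note: (.*) is greedy, so in most regex engines one pass removes only a single trailing space per paragraph. Repeat the replacement, or try the non-greedy form ^(.*?) {1,9}$ to remove up to nine spaces in one pass.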
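
A Python sketch of the same pattern, useful for checking how many spaces one pass removes. Python writes the group reference as \1 rather than $1 and needs re.MULTILINE so that ^ and $ match at every line; LibreOffice itself may behave differently in details. The function names are illustrative.

```python
import re

def strip_greedy(text: str) -> str:
    # Greedy (.*) grabs as many spaces as it can, leaving one for " {1,9}".
    return re.sub(r"^(.*) {1,9}$", r"\1", text, flags=re.MULTILINE)

def strip_lazy(text: str) -> str:
    # Lazy (.*?) stops as early as possible, so " {1,9}" takes up to nine spaces.
    return re.sub(r"^(.*?) {1,9}$", r"\1", text, flags=re.MULTILINE)

print(repr(strip_greedy("foo   \nbar")))
print(repr(strip_lazy("foo   \nbar")))
```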