Gedit regular expressions plugin

From WickyWiki
Revision as of 15:01, 9 December 2012

Install

  1. download the correct plugin from https://bitbucket.org/brandizzi/gedit-re-search/wiki/Home
  2. extract the contents to the gedit plugin directory, typically:
    • ~/.gnome2/gedit/plugins (gedit2)
    • ~/.local/share/gedit/plugins (gedit3)
  3. restart gedit
  4. in the menu: Edit -> Preferences -> Plugins -> enable 'RegEx Search and Replace'
  5. you should now have a 'Regular Expression..' item in the Search menu.

Other gedit plugins

  • http://live.gnome.org/Gedit/Plugins

Regular expressions overview

Expression      Matches
\t              tab
\r              carriage return (CR)
\n              newline (LF)
.               any single character (normally except a newline)
[1234abcd]      any one of the specified characters
[^1234abcd]     any one character that is not among the specified characters
[0-9a-zA-Z]     any one character within the specified ranges
expr*           'expr' repeated zero or more times
expr+           'expr' repeated one or more times
expr{n,m}       'expr' repeated n to m times
(expr)          captures 'expr'; use the captured text in the replacement with \1 \2 \3 etc
^               start of line
$               end of line
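
The constructs in this overview can be tried outside gedit as well. The following is a minimal sketch in Python, assuming the plugin's syntax behaves like Python's `re` module (an assumption, not something this page states); the sample text and variable names are made up for illustration.

```python
import re

# Hypothetical sample text; "\t" in a normal Python string is a real tab.
sample = "tab\there, id42 and id7"

# [0-9]+ : one or more digits in a row
numbers = re.findall(r"[0-9]+", sample)

# [^0-9a-z ] : any single character that is NOT a digit, a lowercase letter or a space
others = re.findall(r"[^0-9a-z ]", sample)

# (expr) captures text; \1 puts the captured text into the replacement
swapped = re.sub(r"id([0-9]{1,2})", r"#\1", sample)

print(numbers)
print(others)
print(swapped)
```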

Examples

Replace \r\n with \n

To enforce Unix style end-of-line.

Search
\r

Replace:

Search        Replace with
\r\n          \n
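
The same replacement can be sketched in Python, assuming the plugin's syntax matches Python's `re` module. The function name `to_unix_eol` is hypothetical and only serves the example.

```python
import re

def to_unix_eol(text):
    # Replace every CR+LF pair (Windows style) with a single LF (Unix style).
    return re.sub(r"\r\n", "\n", text)

dos_text = "first line\r\nsecond line\r\n"
unix_text = to_unix_eol(dos_text)
print(repr(unix_text))
```

A carriage return that is not followed by a newline is left untouched by this pattern.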

Remove trailing whitespace

To further normalize the document.

Search              Replace with
[ \t]{1,9}\n        \n
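
As a rough illustration, here is the pattern applied with Python's `re` module (assumed to be compatible with the plugin's syntax). The function name `strip_trailing_whitespace` is made up for this sketch.

```python
import re

def strip_trailing_whitespace(text):
    # Pattern from the table: 1 to 9 spaces or tabs directly before a newline.
    return re.sub(r"[ \t]{1,9}\n", "\n", text)

sample = "first line   \nsecond line\t\n  indented line\n"
cleaned = strip_trailing_whitespace(sample)
print(repr(cleaned))
```

Because of the `{1,9}` limit, a run of more than nine trailing spaces is only shortened by nine in one pass; repeating the replacement, or using `[ \t]+\n`, removes runs of any length. Leading indentation is not affected.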

Remove EOL: trailing and leading non-capital letter

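To join lines that were wrapped at a fixed length without merging paragraphs. A line break is removed only when the line ends in a lowercase letter or one of , ; : and the next line starts with a lowercase letter.

Search                          Replace with
([a-z,;:])\n{1,9}([a-z])        \1 \2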
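
A small Python sketch of this replacement, assuming `re`-compatible syntax; `join_wrapped_lines` and the sample text are hypothetical.

```python
import re

def join_wrapped_lines(text):
    # Join a line break only when it sits between a lowercase letter (or , ; :)
    # and a lowercase letter. \1 and \2 put the two captured characters back.
    return re.sub(r"([a-z,;:])\n{1,9}([a-z])", r"\1 \2", text)

wrapped = (
    "the quick brown\nfox jumps over\nthe lazy dog.\n\n"
    "New paragraph here,\nstill going."
)
joined = join_wrapped_lines(wrapped)
print(joined)
```

The paragraph break survives here because the preceding line ends in a full stop and the next one starts with a capital letter, so neither side of the pattern matches.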

Remove EOL: leading non-capital letter

To join wrapped lines without merging paragraphs, looking only at the next line: a line break is removed when the next line starts with a lowercase letter.

Note: an underscore (_) is used here to signify a space.

Search              Replace with
\n{1,9}([a-z])      _\1

Remove EOL: trailing non-capital letter

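To join wrapped lines without merging paragraphs, looking only at the current line: a line break is removed when the line ends in a lowercase letter or one of , ; : (an underscore again signifies a space).

Search                  Replace with
([a-z,;:])\n{1,9}       \1_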
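
The two one-sided variants behave differently on the same input. The sketch below compares them in Python, assuming `re`-compatible syntax; the function names and the sample text are invented for the example, and the underscore from the tables is written as a real space.

```python
import re

def join_before_lowercase(text):
    # Leading variant: remove the line break when the NEXT line starts with a-z.
    return re.sub(r"\n{1,9}([a-z])", r" \1", text)

def join_after_lowercase(text):
    # Trailing variant: remove the line break when the line ENDS in a-z , ; or :
    return re.sub(r"([a-z,;:])\n{1,9}", r"\1 ", text)

text = "IBM\nmakes chips\nAnd more"
leading_result = join_before_lowercase(text)
trailing_result = join_after_lowercase(text)
print(repr(leading_result))
print(repr(trailing_result))
```

The leading variant joins "IBM" with "makes" but keeps the break before "And"; the trailing variant does the opposite.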

Remove hyphenation '-' from words

To rejoin words that were hyphenated across a line break. Run this after removing the fixed line length.

Search                          Replace with
([a-z])-\n{1,9}([a-z])          \1\2
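
A minimal Python sketch of the same replacement, assuming `re`-compatible syntax; `dehyphenate` is a hypothetical name.

```python
import re

def dehyphenate(text):
    # A lowercase letter, a hyphen, a line break, a lowercase letter:
    # keep the two letters and drop the hyphen and the break.
    return re.sub(r"([a-z])-\n{1,9}([a-z])", r"\1\2", text)

rejoined = dehyphenate("regular expres-\nsions are use-\nful")
print(rejoined)
```

The pattern cannot tell a line-break hyphen from a real one, so a compound such as "well-known" split at its hyphen would come out as "wellknown" and needs a manual check.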

Split word with capital letter in the middle

To correct an OCR problem: a missing space before a capital letter.

Search                  Replace with
([a-z,.])([A-Z])        \1 \2
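
Sketched in Python under the assumption of `re`-compatible syntax; `split_joined_words` is a made-up name.

```python
import re

def split_joined_words(text):
    # Insert a space between a lowercase letter, comma or full stop
    # and a capital letter that follows it directly.
    return re.sub(r"([a-z,.])([A-Z])", r"\1 \2", text)

fixed = split_joined_words("The endThe next sentence.Another,One")
print(fixed)
```

Names that are intentionally written in mixed case are split as well (for example "WickyWiki" becomes "Wicky Wiki"), so it is safer to step through the matches than to replace all at once.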

Replace 1 (one) in a non-number with I

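To correct an OCR problem: the letter I recognized as the digit 1. The characters on either side of the 1 are captured and put back, otherwise they would be replaced along with it.

Search                  Replace with
([^0-9])1([^0-9])       \1I\2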
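
The following Python sketch (assuming `re`-compatible syntax) wraps the neighbouring characters in capture groups so that they survive the replacement; without the groups, the character before and after the 1 would be overwritten too. The name `fix_ocr_one` is hypothetical.

```python
import re

def fix_ocr_one(text):
    # A "1" with a non-digit on both sides is taken to be a misread "I".
    # \1 and \2 restore the neighbouring characters around the new "I".
    return re.sub(r"([^0-9])1([^0-9])", r"\1I\2", text)

corrected = fix_ocr_one("Then 1 said: 1t is on page 12.")
print(corrected)
```

A "1" at the very start or end of the text has no neighbour on one side and is therefore not matched.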

Search for numbers > 9

To find page numbers. The expression matches any run of 2 to 9 digits.

Search
[0-9]{2,9}
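
For illustration, the same search in Python (assuming `re`-compatible syntax), with an invented sample sentence:

```python
import re

text = "see page 12, figure 3, and pages 104-110"

# Every run of 2 to 9 digits; the single digit "3" is skipped.
page_numbers = re.findall(r"[0-9]{2,9}", text)
print(page_numbers)
```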

Regular expressions in LibreOffice

You can also use regular expressions in LibreOffice. Note that references to captured groups are written with a '$' there, for example $1 instead of \1.

Remove paragraph trailing spaces

The group is written non-greedy, (.*?), so that it does not swallow the trailing spaces itself.

Search              Replace with
^(.*?) {1,9}$       $1
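
LibreOffice itself cannot be scripted in a few lines here, so the sketch below uses Python's `re` module as a stand-in. This is an assumption-laden comparison: Python writes the group reference as `\1` rather than `$1` and needs `re.MULTILINE` for `^` and `$` to match at every line. It compares a greedy `(.*)` group with a non-greedy `(.*?)` group on made-up text.

```python
import re

text = "first   \nsecond \nthird"

# Greedy group: (.*) takes as much as it can, including trailing spaces,
# and leaves only one space for " {1,9}" to match.
greedy = re.sub(r"^(.*) {1,9}$", r"\1", text, flags=re.MULTILINE)

# Non-greedy group: (.*?) takes as little as it can,
# so " {1,9}" matches the whole run of trailing spaces.
lazy = re.sub(r"^(.*?) {1,9}$", r"\1", text, flags=re.MULTILINE)

print(repr(greedy))
print(repr(lazy))
```

In this sketch the greedy form removes a single space per pass, while the non-greedy form removes up to nine at once.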