The Tech Diary

Your Online Notebook!

Journal your struggles and achievements with technology.

Minimal Matching in Vim

Posted by: Brad Waite

Tagged in: Untagged 

One of the first things anyone who tackles regular expressions (regex) learns is the ".*" construct, which matches any character any number of times.  What is sometimes overlooked is that the * is greedy: it grabs as many characters as it can.  While that's frequently useful, there are times when you want it to grab as few as it can.  This is called "minimal matching".  For those familiar with Perl's minimal matching operator, '?', it turns out Vim has one, too!  Read on to find out how to use it.

Say you're trying to strip HTML tags from a document in Vim.  You might have multiple HTML tags on a single line:

Here's the Foo!  It's the best!

If you used a replace regex like this:

:%s/<.*>//g

it would produce:

Here's !  It's the best!

That's obviously not what you want.  What we really want is for the regex engine to grab as little as it can, stopping at the first '>' it reaches.

Vim's minimal matching operator is '\{-}'.  If it looks odd, consider that it's actually a form of the '\{n,m}' specific count operator.  'ab\{1,3}' would match "ab", "abb" and "abbb".  When the count is prefixed with a '-', Vim matches as few as it can, so 'ab\{-1,3}' will match "ab" in "abbb".  It turns out in this case that n and m are not necessary: '\{-}' matches the previous item zero or more times, as few as possible.

Going back to our HTML stripping example, we can use the following regex:

:%s/<.\{-}>//g

Using our previous example source text, that would produce the following text:

Here's the Foo!  It's the best!

Now that's more like it.
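
For readers who want to see the difference outside of Vim, here is a minimal sketch in Python, whose '.*?' plays the same role as Vim's '.\{-}'.  The sample line and its tag names are made up for illustration; they are not the original example text.

```python
import re

# Hypothetical sample line; the tag names are illustrative only.
line = "Here's the <b>Foo</b>!  It's the <i>best</i>!"

# Greedy: .* runs from the first '<' all the way to the last '>' on the line.
greedy = re.sub(r"<.*>", "", line)

# Minimal: .*? stops at the first '>' after each '<',
# which is the counterpart of Vim's <.\{-}>.
minimal = re.sub(r"<.*?>", "", line)

print(greedy)
print(minimal)
```

The greedy version swallows everything between the first and last tag, while the minimal version removes only the tags themselves.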

What if you want to remove all opening and closing tags, even if they had additional style information?

 

The following regex will do exactly what you want, all in one step:

:%s///g

split() function in mysql

Posted by: Brad Waite

Tagged in: Untagged 

The following function definition creates a split() function that splits a string by a given delimiter and returns the specified element.

DROP FUNCTION IF EXISTS split;
CREATE FUNCTION split (str TEXT, delim VARCHAR(1), N INT)
RETURNS TEXT DETERMINISTIC
RETURN SUBSTRING(SUBSTRING_INDEX(str, delim, N), LENGTH(SUBSTRING_INDEX(str, delim, N - 1)) + LENGTH(delim) + (N > 1));

The syntax of the function is as follows.

split(str, separator, pos)
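
To make the position arithmetic easier to follow, here is a rough Python sketch of the same logic.  The helper names substring_index and split_nth are my own, and substring_index only imitates MySQL's SUBSTRING_INDEX for non-negative counts and a single-character delimiter; treat it as an illustration, not a drop-in replacement.

```python
def substring_index(s: str, delim: str, count: int) -> str:
    """Imitate MySQL SUBSTRING_INDEX for count >= 0:
    everything before the count-th occurrence of delim."""
    if count <= 0:
        return ""
    return delim.join(s.split(delim)[:count])


def split_nth(s: str, delim: str, n: int) -> str:
    """Return the n-th (1-based) element of s, mirroring the SQL function."""
    head = substring_index(s, delim, n)  # text up to the n-th element
    # 1-based start position of the n-th element inside head
    start = len(substring_index(s, delim, n - 1)) + len(delim) + int(n > 1)
    return head[start - 1:]  # SUBSTRING is 1-based, Python slices are 0-based


print(split_nth("a,b,c", ",", 2))
```

Under these assumptions, asking for an element past the end of the string gives back an empty string.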

Counting characters in vim

Posted by: Brad Waite

Tagged in: Untagged 

There are several ways to count the number of characters in a file within vim.

The 'n' modifier to the substitute operator ('s') tells vim only to report the number of matches, but not to actually do any substitutions.  Given that, the following command will report the number of characters in our file:

:%s/.//gn

That says, "find ('s///n') every occurrence ('g') of any character ('.') in all lines ('%')."  Note that this will not include any line break characters.  To count line breaks, just prepend the '\_' modifier to the '.', like so:

:%s/\_.//gn

You can also count the characters in a specified range by using a range instead of the '%'.  To count all characters, including line breaks, from lines 20 to 80 do this:

:20,80s/\_.//gn
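
If you want to sanity-check the numbers outside of vim, a small Python sketch like the one below should give comparable counts.  The function name count_chars is my own, and it assumes the text uses plain "\n" line breaks.

```python
def count_chars(text: str, include_line_breaks: bool = False) -> int:
    """Count characters, with or without the "\n" line breaks."""
    if include_line_breaks:
        return len(text)
    return len(text) - text.count("\n")


sample = "ab\ncd\n"
print(count_chars(sample))
print(count_chars(sample, include_line_breaks=True))
```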

Happy vimming!


dspam not checking emails

Posted by: Brad Waite

Tagged in: Untagged 

I've been plagued with a recurring problem with dspam that I had been unable to track down for months.  While dspam catches nearly every spam that comes into my Inbox, I noticed a pattern of emails that not only weren't flagged as spam but apparently weren't even checked by dspam; there were no X-DSPAM headers in them.

I spent hours tracking down whether there were any qmail aliases that were dropping mail directly into users' Maildirs, since that would explain what I was seeing, but there were none to be found.

I then turned on tcpserver logging to see if maybe these spams were coming from a previously whitelisted IP in my tcp.smtp.  Nothing there either.

After all of this, I happened to notice an error in my qmail logs:

delivery 76475: success: 19066:_[10/05/2009_13:02:17]_message_too_big,_delivering/did_0+0+1/

I dunno why I didn't notice that before, but that error explained the problem.  dspam has a setting in dspam.conf called "MaxMessageSize".  Any emails larger than this are passed through without any spam checking.  The idea is that you don't want dspam slowing down your mail server by choking through emails with 200MB binary attachments.  In my case, MaxMessageSize was set to 300KB, and sure enough, most of the spams getting through were larger than 300KB.  After bumping it up to 2MB, nearly all of them have been stopped.
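
In hindsight, a quick script could have confirmed the pattern sooner.  The sketch below is only an illustration: the function name skipped_by_size and the 300KB constant are my own stand-ins, and it simply checks whether a raw message is over the limit and carries no header whose name starts with X-DSPAM.

```python
from email import message_from_bytes

# Assumed threshold for illustration; use whatever MaxMessageSize is set to.
MAX_MESSAGE_SIZE = 300 * 1024


def skipped_by_size(raw: bytes, max_size: int = MAX_MESSAGE_SIZE) -> bool:
    """True if the message exceeds max_size and has no X-DSPAM headers."""
    msg = message_from_bytes(raw)
    has_dspam = any(name.upper().startswith("X-DSPAM") for name in msg.keys())
    return len(raw) > max_size and not has_dspam


small = b"Subject: hi\r\n\r\nhello\r\n"
print(skipped_by_size(small))
```

Running something like this over the messages in a Maildir would show whether the unchecked ones are all above the size limit.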


AUser Manager

Posted by: Brad Waite

Tagged in: Joomla , Captcha , AUser Manager

AUser Manager is a fantastic component that adds much-needed security to the Joomla! user registration process.  It transparently adds full-featured AJAX Captcha and RBL tests, all without any hacks or modifications to core Joomla! files.

I've attached a slightly modified version of 1.5.10 that has the following "improvements":

  • AUser Manager parameter options to disable the strong password generator, username available AJAX notice & password meter.  They're enabled by default, but it gives the admin the option to make com_auser look more like com_user if they wish.
  • en-US language file that's a bit more native for those of us on this side of the pond.
  • Audio digits and letters taken from Pat Fleet's (the voice of AT&T) freeware IVR prompts for asterisk.  They're easier to understand than the computer-generated mp3s.
  • Minor spelling changes (my obsessive-compulsive nature kicked in on these)

Hope my effort adds to this already great component.  Long live Joomla!

Download


Vim: insert line numbers

Posted by: Brad Waite

Tagged in: vim , search and replace

To insert line numbers at the beginning of every line in Vim, do this:

:%s/^/\=line(".")

This is a search and replace over every line (%s/[search]/[replace]) that replaces the beginning of each line (^) with the result of an expression (\=), here the current line number (line(".")).

To add a comma (or any other character) after the line number, do it like this:

:%s/^/\=line(".") . ","

Any valid Vim expression is valid after the "\=", so you could offset the line numbers by 100 (numbering then starts at 101) like this:

:%s/^/\=100+line(".")

Pretty simple and straightforward, when you think about it.
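
For comparison, here is roughly what those substitutions do, sketched in Python.  The function name number_lines and its offset and sep parameters are made up for this example.

```python
def number_lines(text: str, offset: int = 0, sep: str = "") -> str:
    """Prefix each line with offset + its 1-based line number, then sep."""
    lines = text.split("\n")
    return "\n".join(
        f"{offset + i}{sep}{line}" for i, line in enumerate(lines, start=1)
    )


print(number_lines("alpha\nbeta", sep=","))
print(number_lines("alpha\nbeta", offset=100))
```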


remote syslog

Posted by: Brad Waite

syslog is the standard system logging tool on most flavors of unix.  It stores records of system events in a number of user-defined files.  On FreeBSD, these files live by default in /var/log.  syslog has a configuration file, /etc/syslog.conf, that determines which file each system event is logged to.

What a lot of people don't know is that syslog can send events to a remote syslog server.  This means you can have all of your logs from multiple machines stored on a single host automatically.  Both the client and the server need properly configured config files.

Here's a quick primer on syslog.conf:

The file is broken into sections separated by program and/or hostname definition lines.  Program definitions are in the form of "!program" (ex: !httpd), while hostname definitions take the form "+hostname" (ex: +10.0.1.1).  "-" (minus) can be used to exclude a program or hostname, so "!-httpd" would log every program except httpd.  Multiple definitions can be separated by commas (ex: !httpd, qmail).  Program and hostname definitions can be reset by using "*" (!* or +*).

Each section contains rules that apply only to the most recent program and hostname definitions.  Rules are made up of two fields, the selector and the action, separated by tabs or spaces.  The selector defines which types of messages and priorities the rule applies to, and the action defines what to do with those messages.

Selectors look like this: facility.level, where facility is the part of the system that generated the message and can be one of the following keywords:

auth, authpriv, console, cron, daemon, ftp, kern, lpr, mail, mark, news, ntp, security, syslog, user, uucp and local0 through local7.

Multiple selectors can be separated by a ';'.

Level defines the severity of the message and can be one of the following keywords:

emerg, alert, crit, err, warning, notice, info and debug.

You can also use the comparison flags !, <, > and = after the "." to specify a range of severity levels.  For example, "mail.<=notice" would log all notice, info and debug messages for the mail facility.  The default flag is >=, so "mail.alert" would log alert and emerg messages for mail.
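
The comparison flags are easier to reason about with a small model.  The Python sketch below is my own illustration, not anything syslogd ships with: the function name levels_matched is made up, it only looks at the level part of a selector, and it handles <, > and = but not the ! flag.

```python
from typing import List

# Severity levels from most to least severe.
LEVELS = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]


def levels_matched(spec: str) -> List[str]:
    """Levels selected by the part after the '.', e.g. '<=notice' or 'alert'."""
    i = 0
    while i < len(spec) and spec[i] in "<=>":
        i += 1
    ops = spec[:i] or ">="  # the default comparison is >=
    rank = LEVELS.index(spec[i:])
    return [
        level
        for k, level in enumerate(LEVELS)
        if ("=" in ops and k == rank)
        or (">" in ops and k < rank)  # more severe than the named level
        or ("<" in ops and k > rank)  # less severe than the named level
    ]


print(levels_matched("<=notice"))
print(levels_matched("alert"))
```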

The action field can take one of five forms:

  • the full path of a file to which the message is appended (/var/log/mail.log)
  • a hostname on which a syslog server is listening (@hostname)
  • a comma-separated list of users who will see messages on their terminal if logged in (root, ecarter)
  • a *, which writes the message to all logged-in users
  • a | followed by a command (| mail admin)

So how do you combine these configuration parameters to set up remote syslog?  Let's look at a simple example with a server named "logmaster" and a client named "mailhost".

Server Config

# Log all messages coming from mailhost
+mailhost
*.* /var/log/mailhost/mail.log

Client Config

# local logging
*.notice;kern.debug;lpr.info /var/log/messages

# send all mail messages to logmaster
mail.* @logmaster

This is a basic remote syslog configuration that simply sends all messages on mailhost with a "mail" facility to logmaster, where they are appended to a file.  This isn't necessarily ideal since all the mail messages, regardless of their originating program, are lumped into a single file.

By using the program definition after the host definition in the syslog.conf on the server, we can separate log messages based on which program they came from.

Server Config

# Log all messages coming from mailhost
+mailhost

# log qmail messages of all facilities and severities
!qmail
*.* /var/log/mailhost/qmail.log

# log messages from imapd of all facilities and severities
!imapd
*.* /var/log/mailhost/imapd.log

# log messages from POP daemon of all facilities and only err+ severity
!pop3d
*.err /var/log/mailhost/pop3d.log

In this config, we're taking all the mail syslog messages from mailhost and saving ones from qmail, imapd and pop3d to their respective files.

You can probably see how using different combinations of the program and hostname definitions and syslog facilities can match any message and direct it where you wish.

That's remote syslog logging in a nutshell.
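
If you want to test a setup like this without waiting for real mail traffic, you can hand-feed the server a message.  Here is a minimal Python sketch using the standard library's SysLogHandler over UDP.  The function name make_mail_logger, the logger name and the "qmail" tag are my own choices for illustration, and I'm assuming the server takes the text before the colon as the program name when matching a "!qmail" block.

```python
import logging
import logging.handlers


def make_mail_logger(host: str, port: int = 514, tag: str = "qmail") -> logging.Logger:
    """Build a logger that sends "mail" facility messages to a syslog server over UDP."""
    handler = logging.handlers.SysLogHandler(
        address=(host, port),
        facility=logging.handlers.SysLogHandler.LOG_MAIL,
    )
    # Prefix each message with a program tag, e.g. "qmail: ...".
    handler.setFormatter(logging.Formatter(tag + ": %(message)s"))
    logger = logging.getLogger("remote-syslog-sketch")
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger


# Example use (host name taken from the example above):
#   log = make_mail_logger("logmaster")
#   log.warning("test message from mailhost")
```

If the server is set up as shown, the test message should land in the file for whichever program tag you chose.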

 


Shutdown or Restart Vista from Remote Desktop

Posted by: Brad Waite

Tagged in: Untagged 

Here's a couple of ways to do that:

  • Alt-F4 brings up the standard shutdown dialog.
  • From a command prompt type: shutdown -r -t 0

Hide Desktop.ini files in Windows 7 & Vista

Posted by: Brad Waite

Tagged in: Untagged 

One of the first things I do when setting up a new Windows machine is enable the option to "Show hidden files and folders" and disable the option to "Hide protected operating system files (Recommended)".  You can do this in the Folder Options control panel on the View tab.  Then I can see everything on my drive, even the stuff Microsoft thinks is too dangerous for me to see.

On Win 2K and XP, this works great, but in Windows 7 / Vista there's a little snag.  The Desktop.ini file shows up on the desktop.  Not only that, there are two copies that show up!

Here's how to fix that:

  • Open Windows Explorer
  • Select the desktop in the "Folders" tree view
  • From the "Organize" menu choose "Folder and Search Options"
  • Click the "View" tab
  • Select "Do not show hidden files and folders" (this ONLY applies to the desktop folder)


"Delete" renamed Startup folder

Posted by: Brad Waite

Tagged in: Untagged 

When I was trying to troubleshoot some XP startup issues, I decided to rename the Start->Programs->Startup folder to "_Startup", thinking it would disable any startup programs.

Turns out the folder is somehow flagged as a startup folder, and no matter what it's called, the programs will still start up.  In addition, Windows will create a new Startup folder.  Now I've got both a "Startup" and a "_Startup" in my Programs.  I can't delete either one since Windows complains with:

<FOLDER> is a Windows system folder and is required for Windows to run properly. It can not be deleted.

To fix this, I had to move the _Startup folder, rename it, then move it back.  The folder actually resides in "C:\Documents and Settings\[user]\Start Menu\Programs".  I dragged _Startup up to the "Start Menu" folder, renamed it back to "Startup", then dragged it back to "Programs".  I clicked okay when Windows asked if I wanted to copy on top of the current folder.