Thursday, January 29, 2009

Stateful Parser in Python

While working on AIX command parsers for the next release of Zenoss, I came up with a method for parsing the output of system commands where earlier lines are needed to give the current line its context. An example is parsing the MAC addresses of the interfaces on a Mac running OS X Leopard from the output of the system_profiler command. The output is divided into nested sections. First-level section headers aren't indented at all, the next level is indented four spaces, and the level after that six spaces.

For this task I want to parse out the interface names, which are indented four spaces, but there are plenty of similar subsection headers that are not interfaces. The context I need is whether or not the current line is in the Network section.

My implementation makes the Parser a class with an instance variable named "state" that holds the method to use to parse the next line. Each parsing method can reassign state when it sees a line that changes the context.

Notice that h2Pattern matches any subsection header, but I only use it to match a line if the line is inside the Network section (i.e., it is only used inside the network method).
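As a stripped-down sketch of the idea (this is not the Zenoss code; the SectionTracker class and the sample lines are made up for illustration), the following collects every line inside one named top-level section. The only moving part is the state attribute, which is swapped between two bound methods:

```python
class SectionTracker(object):
    """Hypothetical toy parser: collects the lines inside one top-level section."""

    def __init__(self, section):
        self.header = section + ":"
        self.collected = []
        # state always holds the bound method that should see the next line.
        self.state = self.outside

    def outside(self, line):
        # Only the wanted header is interesting here; it flips the state.
        if line.startswith(self.header):
            self.state = self.inside

    def inside(self, line):
        if line[:1].strip():
            # An unindented, non-blank line starts the next top-level section.
            # Hand it to outside() in case it is the wanted header again.
            self.state = self.outside
            self.outside(line)
        elif line.strip():
            self.collected.append(line.strip())


# Made-up sample using the indentation described above.
sample = [
    "Hardware:",
    "    Model Name: MacBook Pro",
    "Network:",
    "    Ethernet:",
    "      Type: Ethernet",
    "Software:",
    "    Boot Mode: Normal",
]

tracker = SectionTracker("Network")
for line in sample:
    tracker.state(line)

print(tracker.collected)
```

The loop never inspects the context itself. It just calls whatever state currently points at, and the methods decide when to hand over.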

Here is the code:

#! /usr/bin/env python

import sys
import re
from pprint import pprint
from subprocess import Popen, PIPE, STDOUT

class Parser(object):

    # A top-level section header starts in column zero.
    h1Pattern = re.compile(r"\S")
    # A subsection header is indented exactly four spaces.
    h2Pattern = re.compile(r"\s{4}\S")

    def __init__(self):
        # Per-instance list, so separate parsers don't share results.
        self.interfaces = []
        # state holds the method that will parse the next line.
        self.state = self.outside

    def outside(self, line):
        if line.startswith("Network:"):
            self.state = self.network

    def network(self, line):
        if self.h1Pattern.match(line):
            # The next top-level section has started.
            self.state = self.outside
        elif self.h2Pattern.match(line):
            # Interface header, e.g. "    Ethernet:"; drop the trailing colon.
            self.interfaces.append([line.strip()[:-1]])
        elif line.strip().startswith("MAC Address:"):
            self.interfaces[-1].append(line.split()[-1])

def main(filename=None):
    if filename:
        lines = open(filename)
    else:
        # universal_newlines=True yields text rather than bytes on Python 3.
        popen = Popen("/usr/sbin/system_profiler", stdout=PIPE, stderr=STDOUT,
                      universal_newlines=True)
        lines = popen.stdout
    parser = Parser()
    for line in lines:
        parser.state(line)
    pprint(parser.interfaces)

if __name__ == '__main__':
    main(*sys.argv[1:2])

Here is the output of my program:

py$ ./system_profiler.py
[['Bluetooth'],
 ['Ethernet', '00:22:41:21:0b:77'],
 ['FireWire', '00:21:e9:ff:fe:ce:eb:1a'],
 ['AirPort', '00:21:e9:e1:10:0f']]
py$
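Note that the Bluetooth entry has no MAC address, so its list has only one element. If a lookup by interface name is more convenient, a small post-processing step can make the missing value explicit. This is only a sketch: interfaces_to_dict is a name I made up for illustration, and the input is the list printed above.

```python
def interfaces_to_dict(interfaces):
    """Map interface name -> MAC address, or None when no MAC line was seen."""
    result = {}
    for entry in interfaces:
        name = entry[0]
        # Entries look like [name] or [name, mac].
        mac = entry[1] if len(entry) > 1 else None
        result[name] = mac
    return result


# The list printed by the program above.
parsed = [
    ['Bluetooth'],
    ['Ethernet', '00:22:41:21:0b:77'],
    ['FireWire', '00:21:e9:ff:fe:ce:eb:1a'],
    ['AirPort', '00:21:e9:e1:10:0f'],
]

macs = interfaces_to_dict(parsed)
print(macs['Ethernet'])
print(macs['Bluetooth'])
```

A dictionary drops the original ordering on older Pythons, so keep the list around if the order of the interfaces matters.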

Here is my system_profiler output:

Hardware:

Hardware Overview:

Model Name: MacBook Pro
Model Identifier: MacBookPro4,1
Processor Name: Intel Core 2 Duo
Processor Speed: 2.4 GHz
Number Of Processors: 1
Total Number Of Cores: 2
L2 Cache: 3 MB
Memory: 4 GB
Bus Speed: 800 MHz
Boot ROM Version: MBP41.00C1.B03
SMC Version: 1.27f1
Serial Number: W88284C7YJX
Sudden Motion Sensor:
State: Enabled

Network:

Bluetooth:

Type: PPP (PPPSerial)
Hardware: Modem
BSD Device Name: Bluetooth-Modem
Has IP Assigned: No
IPv4:
Configuration Method: PPP
IPv6:
Configuration Method: Automatic
Proxies:
FTP Passive Mode: Yes

Ethernet:

Type: Ethernet
Hardware: Ethernet
BSD Device Name: en0
Has IP Assigned: No
IPv4:
Configuration Method: DHCP
IPv6:
Configuration Method: Automatic
Proxies:
Exceptions List: *.local, 169.254/16
FTP Passive Mode: Yes
Ethernet:
MAC Address: 00:22:41:21:0b:77
Media Options:
Media Subtype: Auto Select

FireWire:

Type: FireWire
Hardware: FireWire
BSD Device Name: fw0
Has IP Assigned: No
IPv4:
Configuration Method: DHCP
IPv6:
Configuration Method: Automatic
Proxies:
Exceptions List: *.local, 169.254/16
FTP Passive Mode: Yes
Ethernet:
MAC Address: 00:21:e9:ff:fe:ce:eb:1a
Media Options: Full Duplex
Media Subtype: Auto Select

AirPort:

Type: AirPort
Hardware: AirPort
BSD Device Name: en1
Has IP Assigned: Yes
IPv4 Addresses: 192.168.1.100
IPv4:
Addresses: 192.168.1.100
Configuration Method: DHCP
Interface Name: en1
NetworkSignature: IPv4.Router=192.168.1.1;IPv4.RouterHardwareAddress=00:12:17:1b:e5:56
Router: 192.168.1.1
Subnet Masks: 255.255.255.0
IPv6:
Configuration Method: Automatic
AppleTalk:
Configuration Method: Node
Default Zone: *
Interface Name: en1
Network ID: 65463
Node ID: 122
DNS:
Domain Name: austin.rr.com
Server Addresses: 24.93.41.127, 24.93.41.128
DHCP Server Responses:
Domain Name: austin.rr.com
Domain Name Servers: 24.93.41.127,24.93.41.128
Lease Duration (seconds): 0
DHCP Message Type: 0x05
Routers: 192.168.1.1
Server Identifier: 192.168.1.1
Subnet Mask: 255.255.255.0
Proxies:
Exceptions List: *.local, 169.254/16
FTP Passive Mode: Yes
Ethernet:
MAC Address: 00:21:e9:e1:10:0f
Media Options:
Media Subtype: Auto Select

Software:

System Software Overview:

System Version: Mac OS X 10.5.6 (9G55)
Kernel Version: Darwin 9.6.0
Boot Volume: Macintosh HD
Boot Mode: Normal
Computer Name: napoleon_xiv
User Name: Brian Edwards (bedwards)
Time since boot: 1:28

ATA:

ATA Bus:

HL-DT-ST DVDRW GSA-S10N:

Model: HL-DT-ST DVDRW GSA-S10N
Revision: AP12
Serial Number: K8S86BA5845
Detachable Drive: No
Protocol: ATAPI
Unit Number: 0
Socket Type: Internal
Low Power Polling: Yes
Power Off: Yes

Audio (Built In):

Intel High Definition Audio:

Device ID: 0x106B00A3
Audio ID: 56
Available Devices:
Speaker:
Connection: Internal
Headphone:
Connection: Combo
Microphone:
Connection: Internal
Line In:
Connection: Combo
S/P-DIF Out:
Connection: Combo
S/P-DIF In:
Connection: Combo

Bluetooth:

Apple Bluetooth Software Version: 2.1.3f8
Hardware Settings:
napoleon_xiv:
Address: 00-1f-f3-ad-ec-2d
Manufacturer: Broadcom
Firmware Version: 135 (199)
Bluetooth Power: On
Discoverable: Yes
Requires Authentication: No
Services:
Bluetooth File Transfer:
Folder other devices can browse: ~/Public
Requires Authentication: Yes
State: Enabled
Bluetooth File Exchange:
Folder for accepted items: ~/Downloads
Requires Authentication: No
When other items are accepted: Ask
When PIM items are accepted: Ask
When receiving items: Prompt for each file
State: Enabled
Devices (Paired, Favorites, etc):
bedwards’s mouse:
Name: bedwards’s mouse
Address: 00-1e-52-cc-1e-96
Type: Mouse
Firmware Version: 512
Services: Mighty Mouse
Paired: Yes
Favorite: No
Connected: No
Manufacturer: Broadcom ($2, $314)
Incoming Serial Ports:
Serial Port 1:
Name: Bluetooth-PDA-Sync
RFCOMM Channel: 3
Requires Authentication: No
Outgoing Serial Ports:
Serial Port 1:
Address:
Name: Bluetooth-Modem
RFCOMM Channel: 0
Requires Authentication: No

Diagnostics:

Power On Self-Test:

Last Run: 1/29/09 8:35 PM
Result: Passed

Disc Burning:

HL-DT-ST DVDRW GSA-S10N:

Firmware Revision: AP12
Interconnect: ATAPI
Burn Support: Yes (Apple Shipping Drive)
Cache: 2048 KB
Reads DVD: Yes
CD-Write: -R, -RW
DVD-Write: -R, -R DL, -RW, +R, +R DL, +RW
Write Strategies: CD-TAO, CD-SAO, CD-Raw, DVD-DAO
Media: Insert media and refresh to show available burn speeds

FireWire:

FireWire Bus:

Maximum Speed: Up to 800 Mb/sec

Graphics/Displays:

GeForce 8600M GT:

Chipset Model: GeForce 8600M GT
Type: Display
Bus: PCIe
PCIe Lane Width: x16
VRAM (Total): 256 MB
Vendor: NVIDIA (0x10de)
Device ID: 0x0407
Revision ID: 0x00a1
ROM Revision: 3212
Displays:
Color LCD:
Resolution: 1280 x 800
Depth: 32-bit Color
Core Image: Hardware Accelerated
Main Display: Yes
Mirror: Off
Online: Yes
Quartz Extreme: Supported
Built-In: Yes
Display Connector:
Status: No display connected

Memory:

BANK 0/DIMM0:

Size: 2 GB
Type: DDR2 SDRAM
Speed: 667 MHz
Status: OK
Manufacturer: 0xCE00000000000000
Part Number: 0x4D342037305435363633435A332D43453620
Serial Number: 0x48B2278A

BANK 1/DIMM1:

Size: 2 GB
Type: DDR2 SDRAM
Speed: 667 MHz
Status: OK
Manufacturer: 0xCE00000000000000
Part Number: 0x4D342037305435363633435A332D43453620
Serial Number: 0x48B227A3

Power:

Battery Information:

Model Information:
Serial Number: SMP-ASMB012-38e7-61e
Manufacturer: SMP
Device name: ASMB012
Pack Lot Code: 0002
PCB Lot Code: 0000
Firmware Version: 0110
Hardware Revision: 0500
Cell Revision: 0200
Charge Information:
Charge remaining (mAh): 4929
Fully charged: Yes
Charging: No
Full charge capacity (mAh): 5119
Health Information:
Cycle count: 71
Condition: Good
Battery Installed: Yes
Amperage (mA): -999
Voltage (mV): 12262

System Power Settings:

AC Power:
System Sleep Timer (Minutes): 10
Disk Sleep Timer (Minutes): 10
Display Sleep Timer (Minutes): 10
Automatic Restart On Power Loss: No
Wake On AC Change: No
Wake On Clamshell Open: Yes
Wake On LAN: Yes
Display Sleep Uses Dim: Yes
Battery Power:
System Sleep Timer (Minutes): 10
Disk Sleep Timer (Minutes): 10
Display Sleep Timer (Minutes): 2
Wake On AC Change: No
Wake On Clamshell Open: Yes
Display Sleep Uses Dim: Yes
Reduce Brightness: Yes

Hardware Configuration:

UPS Installed: No

AC Charger Information:

Connected: No
Charging: No

Printers:

HP LaserJet M2727nf MFP (0B4309):

Status: Idle
Print Server: Local
Driver Version: 10.4
Default: Yes
URI: mdns://HP%20LaserJet%20M2727nf%20MFP%20%280B4309%29._pdl-datastream._tcp.local./?bidi
PPD: Generic PostScript Printer
PPD File Version: 1.0
PostScript Version: (2000.0) 1

Serial-ATA:

Intel ICH8-M AHCI:

Vendor: Intel
Product: ICH8-M AHCI
Speed: 1.5 Gigabit
Description: AHCI Version 1.10 Supported

FUJITSU MHY2200BH:

Capacity: 186.31 GB
Model: FUJITSU MHY2200BH
Revision: 0081000D
Serial Number: K43BT862EMP5
Native Command Queuing: Yes
Queue Depth: 32
Removable Media: No
Detachable Drive: No
BSD Name: disk0
Mac OS 9 Drivers: No
Partition Map Type: GPT (GUID Partition Table)
S.M.A.R.T. status: Verified
Volumes:
Macintosh HD:
Capacity: 185.99 GB
Available: 96.48 GB
Writable: Yes
File System: Journaled HFS+
BSD Name: disk0s2
Mount Point: /
Volumes:
disk0s2:
Capacity: 185.99 GB
Available: 96.48 GB
Writable: Yes
File System: Journaled HFS+

USB:

USB High-Speed Bus:

Host Controller Location: Built In USB
Host Controller Driver: AppleUSBEHCI
PCI Device ID: 0x283a
PCI Revision ID: 0x0004
PCI Vendor ID: 0x8086
Bus Number: 0xfa

USB High-Speed Bus:

Host Controller Location: Built In USB
Host Controller Driver: AppleUSBEHCI
PCI Device ID: 0x2836
PCI Revision ID: 0x0004
PCI Vendor ID: 0x8086
Bus Number: 0xfd

Built-in iSight:

Product ID: 0x8502
Vendor ID: 0x05ac (Apple Inc.)
Version: 1.60
Serial Number: 8T86M09BK0003L00
Speed: Up to 480 Mb/sec
Manufacturer: Apple Inc.
Location ID: 0xfd400000
Current Available (mA): 500
Current Required (mA): 500

USB Bus:

Host Controller Location: Built In USB
Host Controller Driver: AppleUSBUHCI
PCI Device ID: 0x2834
PCI Revision ID: 0x0004
PCI Vendor ID: 0x8086
Bus Number: 0x1a

BCM2045B2:

Product ID: 0x4500
Vendor ID: 0x0a5c (Broadcom Corp.)
Version: 1.00
Speed: Up to 12 Mb/sec
Manufacturer: Broadcom
Location ID: 0x1a100000
Current Available (mA): 500
Current Required (mA): 0

Bluetooth USB Host Controller:

Product ID: 0x820f
Vendor ID: 0x05ac (Apple Inc.)
Version: 0.37
Serial Number: 001FF3ADEC2D
Speed: Up to 12 Mb/sec
Manufacturer: Apple, Inc.
Location ID: 0x1a110000
Current Available (mA): 500
Current Required (mA): 0

USB Bus:

Host Controller Location: Built In USB
Host Controller Driver: AppleUSBUHCI
PCI Device ID: 0x2835
PCI Revision ID: 0x0004
PCI Vendor ID: 0x8086
Bus Number: 0x3a

USB Bus:

Host Controller Location: Built In USB
Host Controller Driver: AppleUSBUHCI
PCI Device ID: 0x2832
PCI Revision ID: 0x0004
PCI Vendor ID: 0x8086
Bus Number: 0x5d

Apple Internal Keyboard / Trackpad:

Product ID: 0x0230
Vendor ID: 0x05ac (Apple Inc.)
Version: 0.70
Speed: Up to 12 Mb/sec
Manufacturer: Apple, Inc.
Location ID: 0x5d200000
Current Available (mA): 500
Current Required (mA): 40

IR Receiver:

Product ID: 0x8242
Vendor ID: 0x05ac (Apple Inc.)
Version: 0.16
Speed: Up to 1.5 Mb/sec
Manufacturer: Apple Computer, Inc.
Location ID: 0x5d100000
Current Available (mA): 500
Current Required (mA): 100

USB Bus:

Host Controller Location: Built In USB
Host Controller Driver: AppleUSBUHCI
PCI Device ID: 0x2830
PCI Revision ID: 0x0004
PCI Vendor ID: 0x8086
Bus Number: 0x1d

USB Bus:

Host Controller Location: Built In USB
Host Controller Driver: AppleUSBUHCI
PCI Device ID: 0x2831
PCI Revision ID: 0x0004
PCI Vendor ID: 0x8086
Bus Number: 0x3d

AirPort Card:

AirPort Card Information:

Wireless Card Type: AirPort Extreme (0x14E4, 0x8C)
Wireless Card Locale: USA
Wireless Card Firmware Version: Broadcom BCM43xx 1.0 (5.10.38.24)
Current Wireless Network: crabs
Wireless Channel: 10

Firewall:

Firewall Settings:

Mode: Allow all incoming connections

Locations:

Automatic:

Active Location: Yes
Services:
Bluetooth:
Type: PPP
IPv4:
Configuration Method: PPP
IPv6:
Configuration Method: Automatic
Proxies:
FTP Passive Mode: Yes
PPP:
ACSP Enabled: No
Display Terminal Window: No
Redial Count: 1
Redial Enabled: Yes
Redial Interval: 5
Use Terminal Script: No
Dial On Demand: No
Disconnect On Fast User Switch: Yes
Disconnect On Idle: Yes
Disconnect On Idle Time: 600
Disconnect On Logout: Yes
Disconnect On Sleep: Yes
Idle Reminder: No
Idle Reminder Time: 1800
IPCP Compression VJ: Yes
LCP Echo Enabled: No
LCP Echo Failure: 4
LCP Echo Interval: 10
Log File: /var/log/ppp.log
Verbose Logging: No
Ethernet:
Type: Ethernet
BSD Device Name: en0
Hardware (MAC) Address: 00:22:41:21:0b:77
IPv4:
Configuration Method: DHCP
IPv6:
Configuration Method: Automatic
AppleTalk:
Configuration Method: Node
Proxies:
Exceptions List: *.local, 169.254/16
FTP Passive Mode: Yes
FireWire:
Type: FireWire
BSD Device Name: fw0
Hardware (MAC) Address: 00:21:e9:ff:fe:ce:eb:1a
IPv4:
Configuration Method: DHCP
IPv6:
Configuration Method: Automatic
Proxies:
Exceptions List: *.local, 169.254/16
FTP Passive Mode: Yes
AirPort:
Type: IEEE80211
BSD Device Name: en1
Hardware (MAC) Address: 00:21:e9:e1:10:0f
IPv4:
Configuration Method: DHCP
IPv6:
Configuration Method: Automatic
AppleTalk:
Configuration Method: Node
Proxies:
Exceptions List: *.local, 169.254/16
FTP Passive Mode: Yes
IEEE80211:
Join Mode: Automatic
PowerEnabled: 1
PreferredNetworks:
SecurityType: WPA2 Personal
SSID_STR: ZenossRuss
Unique Network ID: D8CDCCF9-9FD5-4C80-8CDA-8160F1659EEF
Unique Password ID: BA30ABE4-A732-4406-8AA2-60012FB084E1
SecurityType: Open
SSID_STR: crabs
Unique Network ID: D3F14C00-EBBE-43B6-BFF2-498083FF9098
SecurityType: WPA2 Personal
SSID_STR: ZenossEast
Unique Network ID: 10FFCB25-046C-4012-BCAB-A8E8340BF3EF
Unique Password ID: 173998E7-2F01-4F9A-A859-36698B107F4B
SecurityType: WPA2 Personal
SSID_STR: ZenossWest
Unique Network ID: 797AF9FB-020A-451E-B2FE-799FC173B9A1
Unique Password ID: 24491972-3815-4BEC-8821-679091B2C73F
SecurityType: WEP
SSID_STR: zenoss3
Unique Network ID: DF9CB19F-9210-49C2-BBBC-12760A79D0DF
Unique Password ID: 489422C2-D919-4D29-A562-65B26F15E61C
SecurityType: WPA2 Personal
SSID_STR: HatchettWiFI
Unique Network ID: EFCFE437-5BFD-47A4-998F-2AC6838731D4
Unique Password ID: FC890BE3-C54C-4F44-ADC7-CE020961FA64
SecurityType: Open
SSID_STR: sedona
Unique Network ID: BA0AB5D9-569D-4F03-9F02-0277F39EF0E4
SecurityType: Open
SSID_STR: onioncreek
Unique Network ID: D3257C44-33F5-4D1F-A08D-043C14F40A0D

Volumes:

net:

Type: autofs
Mount Point: /net
Mounted From: map -hosts
Automounted: Yes

home:

Type: autofs
Mount Point: /home
Mounted From: map auto_home
Automounted: Yes

Universal Access:

Universal Access Information:

Cursor Magnification: Off
Display: Black on White
Flash Screen: Off
Mouse Keys: Off
Slow Keys: Off
Sticky Keys: Off
VoiceOver: Off
Zoom: Off

1 comment:

pm_soipel said...

thanks for this. very useful for me, i'm extracting usb-device information from system_profiler's output.
in case you didn't know, you can restrict the output of system_profiler by passing a datatype to it:

system_profiler SPNetworkDataType

(also check out system_profiler -listdatatypes)
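A sketch of how that tip might be used from Python, assuming the restricted output keeps the same section headers and indentation as the full report (I have not verified this). The helper names profiler_command and profiler_lines are made up for illustration; only the command path and the SPNetworkDataType argument come from the post and the comment above.

```python
from subprocess import Popen, PIPE

SYSTEM_PROFILER = "/usr/sbin/system_profiler"


def profiler_command(datatype=None):
    """Build the argument list; datatype is e.g. "SPNetworkDataType"."""
    command = [SYSTEM_PROFILER]
    if datatype:
        command.append(datatype)
    return command


def profiler_lines(datatype=None):
    """Run system_profiler (Mac only) and return its output as text lines."""
    popen = Popen(profiler_command(datatype), stdout=PIPE,
                  universal_newlines=True)
    output, _ = popen.communicate()
    return output.splitlines()


# Building the command works anywhere; running it needs a Mac.
print(profiler_command("SPNetworkDataType"))
```

On a Mac, the lines returned by profiler_lines("SPNetworkDataType") could be fed to a parser one at a time in place of the full report.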