Groovy/FAQ/File Access
From Wiki.crossplatform.ru
Introduction
//----------------------------------------------------------------------------------
//testfile = new File('/usr/local/widgets/data') // unix
testfile = new File('Pleac/data/blue.txt') // windows
testfile.eachLine{ if (it =~ /blue/) println it }
// Groovy (like Java) uses the File class as an abstraction for
// the path representing a potential file system resource.
// Channels and Streams (along with Reader and Writer helper
// classes) are used to read and write to files (and other
// things). Files, channels, streams etc are all "normal"
// objects; they can be passed around in your programs just
// like other objects (though there are some restrictions
// covered elsewhere - e.g. you can't expect to pass a File
// object between JVMs on different machines running different
// operating systems and expect them to maintain a meaningful
// value across the different JVMs). In addition to Streams,
// there is also support for random access to files.
// Many operations are available on streams and channels. Some
// return values to indicate success or failure, some can throw
// exceptions, other times both styles of error reporting may be
// available.
// Streams at the lowest level are just a sequence of bytes though
// there are various abstractions at higher levels to allow
// interacting with streams at encoded character, data type or
// object levels if desired. Standard streams include System.in,
// System.out and System.err. Java and Groovy on top of that
// provide facilities for buffering, filtering and processing
// streams in various ways.
// File channels provide more powerful operations than streams
// for reading and writing files such as locks, buffering,
// positioning, concurrent reading and writing, mapping to memory
// etc. In the examples which follow, streams will be used for
// simple cases, channels when more advanced features are
// required. Groovy currently focusses on providing extra support
// at the file and stream level rather than channel level.
// This makes the simple things easy but lets you do more complex
// things by just using the appropriate Java classes. All Java
// classes are available within Groovy by default.
// Groovy provides syntactic sugar over the top of Java's file
// processing capabilities by providing meaning to shorthand
// operators and by automatically handling scaffolding type
// code such as opening, closing and handling exceptions behind
// the scenes. It also provides many powerful closure operators,
// e.g. file.eachLineMatch(pattern){ some_operation } will open
// the file, process it line-by-line, finding all lines which
// match the specified pattern and then invoke some operation
// for the matching line(s) if any, before closing the file.
// this example shows how to access the standard input stream
// numericCheckingScript:
prompt = '\n> '
print 'Enter text including a digit:' + prompt
new BufferedReader(new InputStreamReader(System.in)).eachLine{ line ->
    // line is read from System.in
    if (line =~ '\\d') println "Read: $line" // normal output to System.out
    else System.err.println 'No digit found.' // this message to System.err
}
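// A minimal sketch (not part of the original script) contrasting the byte-level
// and character-level views of the same data described above. The sample text
// and the choice of UTF-8 are assumptions made for illustration.
def introBytes = 'na\u00efve caf\u00e9'.getBytes('UTF-8')
def introByteCount = 0
new ByteArrayInputStream(introBytes).eachByte{ introByteCount++ } // raw bytes
def introText = new InputStreamReader(new ByteArrayInputStream(introBytes), 'UTF-8').text
assert introByteCount == 12   // the two accented letters take two bytes each
assert introText.size() == 10 // the Reader sees ten characters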
//----------------------------------------------------------------------------------
Opening a File
//----------------------------------------------------------------------------------
// test values (change for your os and directories)
inputPath='Pleac/src/pleac7.groovy'; outPath='Pleac/temp/junk.txt'
// For input Java uses InputStreams (for byte-oriented processing) or Readers
// (for character-oriented processing). These can throw FileNotFoundException.
// There are also other stream variants: buffered, data, filters, objects, ...
inputFile = new File(inputPath)
inputStream = new FileInputStream(inputFile)
reader = new FileReader(inputFile)
inputChannel = inputStream.channel
// Examples for random access to a file
file = new RandomAccessFile(inputFile, "rw") // for read and write
channel = file.channel
// Groovy provides some sugar coating on top of Java
println inputFile.text.size()
// => 13496
// For output Java uses OutputStreams or Writers. Can throw FileNotFound
// or IO exceptions. There are also other flavours of stream: buffered,
// data, filters, objects, ...
outFile = new File(outPath)
appendFlag = false
outStream = new FileOutputStream(outFile, appendFlag)
writer = new FileWriter(outFile, appendFlag)
outChannel = outStream.channel
// Also some Groovy sugar coating
outFile << 'A Chinese sailing vessel'
println outFile.text.size() // => 24
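// A hedged sketch of further Groovy sugar: the with... methods open the resource,
// hand it to the closure and close it afterwards, even if the closure throws.
// The temp file is a stand-in so that the example does not touch real data.
def openScratch = File.createTempFile('pleac7open', '.txt')
openScratch.deleteOnExit()
openScratch.withWriter{ w -> w << 'first line\n' << 'second line\n' }
openScratch.withWriterAppend{ w -> w << 'third line\n' }
def openFirst = openScratch.withReader{ r -> r.readLine() }
assert openFirst == 'first line'
assert openScratch.readLines().size() == 3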
Opening Files with Unusual Filenames
//----------------------------------------------------------------------------------
// Unusual filenames are no problem in Groovy: the filename is passed through as-is
// and never scanned for characters with special meaning, much like Perl's sysopen.
// Options are either additional parameters or captured in different classes,
// e.g. Input vs Output, Buffered vs non-buffered etc.
new FileReader(inputPath)
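// A small sketch (the file name is made up for illustration): the name is handed
// to the operating system unchanged, so characters that a shell would interpret
// need no escaping. Files.createTempDirectory assumes Java 7 or later.
def oddDir = java.nio.file.Files.createTempDirectory('pleac7odd').toFile()
def oddFile = new File(oddDir, '-odd name & more;.txt')
oddFile.text = 'still just a file'
assert oddFile.withReader{ it.readLine() } == 'still just a file'
oddFile.delete(); oddDir.delete()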
//----------------------------------------------------------------------------------
Expanding Tildes in Filenames
//----------------------------------------------------------------------------------
// '~' is a shell expansion feature rather than file system feature per se.
// Because '~' is a valid filename character in some operating systems, and Java
// attempts to be cross-platform, it doesn't automatically expand tildes.
// Given that '~' expansion is commonly used however, Java puts the $HOME
// environment variable (used by shells to do typical expansion) into the
// "user.home" system property. This works across operating systems - though
// the value inside differs from system to system so you shouldn't rely on its
// content to be of a particular format. In most cases though you should be
// able to write a regex that will work as expected. Also, Apple's
// NSPathUtilities can expand and introduce Tildes on platforms it supports.
path = '~paulk/.cvspass'
name = System.getProperty('user.name')
home = System.getProperty('user.home')
println home + path.replaceAll("~$name(.*)", '$1')
// => C:\Documents and Settings\Paul/.cvspass
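// A hedged sketch of a reusable helper. expandTilde is a hypothetical name, not a
// Groovy or Java API. It only understands '~' and '~<current user>'; other users'
// home directories cannot be looked up portably from Java, so they are left alone.
def expandTilde(String path) {
    def home = System.getProperty('user.home')
    def user = '~' + System.getProperty('user.name')
    if (path == '~' || path.startsWith('~/')) return home + path.substring(1)
    if (path == user || path.startsWith(user + '/')) return home + path.substring(user.size())
    return path
}
assert expandTilde('~/.cvspass') == System.getProperty('user.home') + '/.cvspass'
assert expandTilde('/etc/motd') == '/etc/motd'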
//----------------------------------------------------------------------------------
Making Perl Report Filenames in Errors
//----------------------------------------------------------------------------------
// The exception raised in Groovy reports the filename
try {
    new File('unknown_path/bad_file.ext').text
} catch (Exception ex) {
    System.err.println(ex.message)
}
// =>
// unknown_path\bad_file.ext (The system cannot find the path specified)
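// A brief sketch of adding your own context: catch the narrower IOException and
// report the absolute path, which is often more useful than the relative one.
// The wording of the underlying message is platform specific.
def errFile = new File('unknown_path/bad_file.ext')
def errSeen = false
try {
    errFile.text
} catch (IOException ex) {
    errSeen = true
    System.err.println "Cannot read ${errFile.absolutePath}: ${ex.message}"
}
assert errSeen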
//----------------------------------------------------------------------------------
Creating Temporary Files
//----------------------------------------------------------------------------------
try {
    temp = File.createTempFile("prefix", ".suffix")
    temp.deleteOnExit()
} catch (IOException ex) {
    System.err.println("Temp file could not be created")
}
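// A minimal sketch of the full life cycle, assuming you prefer to delete the file
// yourself rather than wait for JVM exit. The prefix and suffix are arbitrary.
def tmpScratch = File.createTempFile('pleac7', '.tmp')
try {
    tmpScratch << 'scratch data'
    assert tmpScratch.text == 'scratch data'
} finally {
    tmpScratch.delete()
}
assert !tmpScratch.exists()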
//----------------------------------------------------------------------------------
Storing Files Inside Your Program Text
//----------------------------------------------------------------------------------
// no special features are provided, here is a way to do it manually
// DO NOT REMOVE THE FOLLOWING STRING DEFINITION.
pleac_7_6_embeddedFileInfo = '''
Script size is 13731
Last script update: Wed Jan 10 19:05:58 EST 2007
'''
ls = System.getProperty('line.separator')
file = new File('Pleac/src/pleac7.groovy')
regex = /(?ms)(?<=^pleac_7_6_embeddedFileInfo = ''')(.*)(?=^''')/
def readEmbeddedInfo() {
    m = file.text =~ regex
    println 'Found:\n' + m[0][1]
}
def writeEmbeddedInfo() {
    lastMod = new Date(file.lastModified())
    newInfo = "${ls}Script size is ${file.size()}${ls}Last script update: ${lastMod}${ls}"
    file.write(file.text.replaceAll(regex, newInfo))
}
readEmbeddedInfo()
// writeEmbeddedInfo() // uncomment to make script update itself
// readEmbeddedInfo() // uncomment to redisplay the embedded info after the update
// => (output when above two method call lines are uncommented)
// Found:
//
// Script size is 13550
// Last script update: Wed Jan 10 18:56:03 EST 2007
//
// Found:
//
// Script size is 13731
// Last script update: Wed Jan 10 19:05:58 EST 2007
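// A simpler, read-only variation (a sketch, not from the original): keep the data
// in a multi-line string and read it through a StringReader as if it were a file.
// The sample records are invented.
def embeddedData = """\
alpha 1
beta 2
gamma 3
"""
def embeddedTotal = 0
new StringReader(embeddedData).eachLine{ rec -> embeddedTotal += rec.split(' ')[1].toInteger() }
assert embeddedTotal == 6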
//----------------------------------------------------------------------------------
Writing a Filter
//----------------------------------------------------------------------------------
// general pattern for reading from System.in is:
// System.in.readLines().each{ processLine(it) }
// general pattern for a filter which can either process file args or read from System.in is:
// if (args.size() != 0) args.each{
// file -> new File(file).eachLine{ processLine(it) }
// } else System.in.readLines().each{ processLine(it) }
// note: the following examples are not file-related per se. They show
// how to do option processing in scenarios which typically also
// involve file arguments. The reader should also consider using a
// pre-packaged options parser package (there are several popular
// ones) rather than the hard-coded processing examples shown here.
chopFirst = false
columns = 0
args = ['-c', '-30', 'somefile']
// demo1: optional c
if (args[0] == '-c') {
    chopFirst = true
    args = args[1..-1]
}
assert args == ["-30", "somefile"]
assert chopFirst
// demo2: processing numerical options
if (args[0] =~ /^-(\d+)$/) {
    columns = args[0][1..-1].toInteger()
    args = args[1..-1]
}
assert args == ["somefile"]
assert columns == 30
// demo3: multiple args (again consider option parsing package)
args = ['-n','-a','file1','file2']
nostdout = false
append = false
unbuffer = false
ignore_ints = false
files = []
args.each{ arg ->
    switch(arg) {
        case '-n': nostdout = true; break
        case '-a': append = true; break
        case '-u': unbuffer = true; break
        case '-i': ignore_ints = true; break
        default: files += arg
    }
}
if (files.any{ it.startsWith('-')}) {
    System.err.println("usage: demo3 [-ainu] [filenames]")
}
// process files ...
assert nostdout && append && !unbuffer && !ignore_ints
assert files == ['file1','file2']
// find login: print all lines containing the string "login" (command-line version)
//% groovy -ne "if (line =~ 'login') println line" filename
// find login variation: lines containing "login" with line number (command-line version)
//% groovy -ne "if (line =~ 'login') println count + ':' + line" filename
// lowercase file (command-line version)
//% groovy -pe "line.toLowerCase()"
// count chunks but skip comments and stop when reaching "__DATA__" or "__END__"
chunks = 0; done = false
testfile = new File('Pleac/data/chunks.txt') // change on your system
lines = testfile.readLines()
for (line in lines) {
    if (!line.trim()) continue
    words = line.split(/[^\w#]+/).toList()
    for (word in words) {
        if (word =~ /^#/) break
        if (word in ["__DATA__", "__END__"]) { done = true; break }
        chunks += 1
    }
    if (done) break
}
println "Found $chunks chunks"
// groovy "one-liner" (cough cough) for turning .history file into pretty version:
//% groovy -e "m=new File(args[0]).text=~/(?ms)^#\+(\d+)\r?\n(.*?)$/;(0..<m.count).each{println ''+new Date(m[it][1].toInteger())+' '+m[it][2]}" .history
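// => (the 1970 dates arise because the history timestamps are in seconds while
// java.util.Date expects milliseconds; m[it][1].toLong() * 1000 gives real dates)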
// Sun Jan 11 18:26:22 EST 1970 less /etc/motd
// Sun Jan 11 18:26:22 EST 1970 vi ~/.exrc
// Sun Jan 11 18:26:22 EST 1970 date
// Sun Jan 11 18:26:22 EST 1970 who
// Sun Jan 11 18:26:22 EST 1970 telnet home
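// The general filter pattern from the start of this section, sketched as a
// reusable method. runFilter is a hypothetical name; taking the fallback stream
// as a parameter (instead of using System.in directly) is an assumption that
// makes the method easy to try out.
def runFilter(List fileNames, InputStream fallback, Closure processLine) {
    if (fileNames) {
        fileNames.each{ name -> new File(name).eachLine{ processLine(it) } }
    } else {
        fallback.eachLine{ processLine(it) }
    }
}
def filterSeen = []
runFilter([], new ByteArrayInputStream('one\ntwo\n'.bytes)) { filterSeen << it.toUpperCase() }
assert filterSeen == ['ONE', 'TWO']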
//----------------------------------------------------------------------------------
Modifying a File in Place with Temporary File
//----------------------------------------------------------------------------------
// test data for below
testPath = 'Pleac/data/process.txt'
// general pattern
def processWithBackup(inputPath, Closure processLine) {
    def input = new File(inputPath)
    def out = File.createTempFile("prefix", ".suffix")
    out.write('') // create empty file
    count = 0
    input.eachLine{ line ->
        count++
        processLine(out, line, count)
    }
    def dest = new File(inputPath + ".orig")
    dest.delete() // clobber previous backup
    input.renameTo(dest)
    out.renameTo(input)
}
// use withPrintWriter if you don't want the '\n''s appearing
processWithBackup(testPath) { out, line, count ->
    if (count == 20) { // we are at the 20th line
        out << "Extra line 1\n"
        out << "Extra line 2\n"
    }
    out << line + '\n'
}
processWithBackup(testPath) { out, line, count ->
    if (!(count in 20..30)) // skip the 20th line to the 30th
        out << line + '\n'
}
// equivalent to "one-liner":
//% groovy -i.orig -pe "if (!(count in 20..30)) out << line" testPath
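// A hedged variation on processWithBackup (rewriteWithBackup is a hypothetical
// name). It assumes the temp file should live beside the input, so that the final
// rename stays on one file system, and it checks the results of renameTo().
def rewriteWithBackup(File input, Closure transform) {
    def tmp = File.createTempFile('rewrite', '.tmp', input.absoluteFile.parentFile)
    tmp.withPrintWriter{ writer ->
        input.eachLine{ text, number -> transform(writer, text, number) }
    }
    def backup = new File(input.path + '.orig')
    backup.delete() // clobber previous backup
    if (!input.renameTo(backup)) throw new IOException("Cannot back up $input")
    if (!tmp.renameTo(input)) throw new IOException("Cannot replace $input")
}
def rewriteDir = java.nio.file.Files.createTempDirectory('pleac7rewrite').toFile()
def rewriteTarget = new File(rewriteDir, 'data.txt')
rewriteTarget.text = 'a\nb\n'
rewriteWithBackup(rewriteTarget) { writer, text, number -> writer.println "$number: $text" }
def rewriteResult = rewriteTarget.readLines()
def rewriteBackup = new File(rewriteDir, 'data.txt.orig').readLines()
rewriteDir.deleteDir()
assert rewriteResult == ['1: a', '2: b']
assert rewriteBackup == ['a', 'b']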
//----------------------------------------------------------------------------------
Modifying a File in Place with -i Switch
//----------------------------------------------------------------------------------
//% groovy -i.orig -pe 'FILTER COMMAND' file1 file2 file3 ...
// the following may also be possible on unix systems (unchecked)
//#!/usr/bin/groovy -i.orig -p
// filter commands go here
// "one-liner" templating scenario: change DATE -> current time
//% groovy -pi.orig -e 'line.replaceAll(/DATE/){new Date()}'
//% groovy -i.old -pe 'line.replaceAll(/\bhisvar\b/, "hervar")' *.[Cchy] (globbing platform specific)
// one-liner for correcting spelling typos
//% groovy -i.orig -pe 'line.replaceAll(/(?i)\b(p)earl\b/){ all, p -> p + "erl" }' *.[Cchy] (globbing platform specific)
//----------------------------------------------------------------------------------
Modifying a File in Place Without a Temporary File
//----------------------------------------------------------------------------------
// general pattern
def processFileInplace(file, Closure processText) {
    def text = file.text
    file.write(processText(text))
}
// templating scenario: change DATE -> current time
testfile = new File('Pleac/data/pleac7_10.txt') // replace on your system
processFileInplace(testfile) { text ->
    text.replaceAll(/(?m)DATE/, new Date().toString())
}
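// A sketch of an update through a single RandomAccessFile handle: read, seek back
// to the start, write, then truncate. The temp file and its contents are made up,
// and the whole file is assumed to fit in memory.
def inplaceFile = File.createTempFile('pleac7inplace', '.txt')
inplaceFile.deleteOnExit()
inplaceFile.setText('Generated on DATE by hand\n', 'UTF-8')
def inplaceRaf = new RandomAccessFile(inplaceFile, 'rw')
try {
    def oldBytes = new byte[(int) inplaceRaf.length()]
    inplaceRaf.readFully(oldBytes)
    def newBytes = new String(oldBytes, 'UTF-8').replace('DATE', 'now').getBytes('UTF-8')
    inplaceRaf.seek(0)
    inplaceRaf.write(newBytes)
    inplaceRaf.setLength(newBytes.length) // truncate leftovers when the text shrinks
} finally {
    inplaceRaf.close()
}
assert inplaceFile.getText('UTF-8') == 'Generated on now by hand\n'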
//----------------------------------------------------------------------------------
Locking a File
//----------------------------------------------------------------------------------
// You need to use Java's FileChannel class to acquire locks. The exact
// nature of the lock is somewhat dependent on the operating system.
def processFileWithLock(file, processStream) {
    def random = new RandomAccessFile(file, "rw")
    try {
        def lock = random.channel.lock() // acquire exclusive lock
        try {
            processStream(random)
        } finally {
            lock.release() // release even if processing fails
        }
    } finally {
        random.close()
    }
}
// Instead of an exclusive lock you can acquire a shared lock.
// Also, you can acquire a lock for a region of a file by specifying
// the start position and size of the region when acquiring the lock.
// For non-blocking functionality, use tryLock() instead of lock().
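def processFileWithTryLock(file, processStream) {
    def random = new RandomAccessFile(file, "rw")
    def channel = random.channel
    def MAX_ATTEMPTS = 30
    def lock = null
    for (i in 0..<MAX_ATTEMPTS) {
        // tryLock() returns null if another process holds the lock; on Java 6 and
        // later it throws OverlappingFileLockException if this JVM already holds it
        try {
            lock = channel.tryLock()
        } catch (java.nio.channels.OverlappingFileLockException ex) {
            lock = null
        }
        if (lock != null) break
        println 'Could not get lock, pausing ...'
        Thread.sleep(500) // 500 millis = 0.5 secs
    }
    if (lock == null) {
        println 'Unable to acquire lock, aborting ...'
    } else {
        processStream(random)
        lock.release()
    }
    random.close()
}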
// non-blocking multithreaded example: print first line while holding lock
Thread.start{
    processFileWithLock(testfile) { source ->
        println 'First reader: ' + source.readLine().toUpperCase()
        Thread.sleep(2000) // 2000 millis = 2 secs
    }
}
processFileWithTryLock(testfile) { source ->
    println 'Second reader: ' + source.readLine().toUpperCase()
}
// =>
// Could not get lock, pausing ...
// First reader: WAS LOWERCASE
// Could not get lock, pausing ...
// Could not get lock, pausing ...
// Could not get lock, pausing ...
// Could not get lock, pausing ...
// Second reader: WAS LOWERCASE
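// A sketch of the shared-lock variant mentioned above (withSharedLock is a
// hypothetical name). FileChannel.lock(position, size, shared) takes a start
// position and a size; on platforms without shared locks Java falls back to an
// exclusive lock, which isShared() reveals.
def withSharedLock(File file, Closure body) {
    def raf = new RandomAccessFile(file, 'r') // read access suffices for a shared lock
    try {
        def lock = raf.channel.lock(0L, Long.MAX_VALUE, true)
        try {
            return body(raf, lock)
        } finally {
            lock.release()
        }
    } finally {
        raf.close()
    }
}
def sharedFile = File.createTempFile('pleac7lock', '.txt')
sharedFile.deleteOnExit()
sharedFile.text = 'was lowercase\n'
def sharedLine = withSharedLock(sharedFile) { raf, lock ->
    println(lock.shared ? 'shared lock' : 'exclusive lock (no shared locks here)')
    raf.readLine()
}
assert sharedLine == 'was lowercase'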
//----------------------------------------------------------------------------------
Flushing Output
//----------------------------------------------------------------------------------
// In Java, output streams and writers have a flush() method and file channels
// have a force() method (applicable also to memory-mapped files). When creating
// PrintWriters and PrintStreams, an autoFlush option can be provided.
// From a FileInputStream or FileOutputStream you can ask for the FileDescriptor,
// which has a sync() method, but you normally wouldn't; you'd just use flush().
inputStream = testfile.newInputStream() // returns a buffered input stream
autoFlush = true
printStream = new PrintStream(outStream, autoFlush)
printWriter = new PrintWriter(outStream, autoFlush)
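// A small sketch showing why flushing matters: text written to a buffered writer
// does not reach the underlying stream until flush() (or close()) is called. The
// in-memory stream stands in for a file.
def flushSink = new ByteArrayOutputStream()
def flushWriter = new BufferedWriter(new OutputStreamWriter(flushSink, 'UTF-8'))
flushWriter.write('pending')
assert flushSink.size() == 0                    // still held in the buffer
flushWriter.flush()
assert flushSink.toString('UTF-8') == 'pending' // now visible downstream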
//----------------------------------------------------------------------------------
Reading from Many Filehandles Without Blocking
//----------------------------------------------------------------------------------
// See the comments in 7.14 about scenarios where non-blocking can be
// avoided. Also see 7.14 regarding basic information about channels.
// An advanced feature of the java.nio.channels package is supported
// by the Selector and SelectableChannel classes. These allow efficient
// server multiplexing amongst responses from a number of potential sources.
// Under the covers, it allows mapping to native operating system features
// supporting such multiplexing or using a pool of worker processing threads
// much smaller in size than the total available connections.
//
// The general pattern for using selectors is:
//
// while (true) {
//     selector.select()
//     def keys = selector.selectedKeys().iterator()
//     while (keys.hasNext()) {
//         handleKey(keys.next())
//         keys.remove()
//     }
// }
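// A runnable sketch of that pattern, assuming an in-process Pipe as the only
// source so that no network is needed. All variable names are illustrative.
def selPipe = java.nio.channels.Pipe.open()
selPipe.source().configureBlocking(false) // selectable channels must be non-blocking
def selSelector = java.nio.channels.Selector.open()
selPipe.source().register(selSelector, java.nio.channels.SelectionKey.OP_READ)
selPipe.sink().write(java.nio.ByteBuffer.wrap('ping'.getBytes('UTF-8')))
def selReceived = ''
if (selSelector.select(1000L) > 0) { // wait at most one second
    def selKeys = selSelector.selectedKeys().iterator()
    while (selKeys.hasNext()) {
        def selKey = selKeys.next()
        selKeys.remove() // mark the key as handled
        def selBuffer = java.nio.ByteBuffer.allocate(16)
        def selCount = selKey.channel().read(selBuffer)
        if (selCount > 0) selReceived += new String(selBuffer.array(), 0, selCount, 'UTF-8')
    }
}
selSelector.close(); selPipe.sink().close(); selPipe.source().close()
assert selReceived == 'ping'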
//----------------------------------------------------------------------------------
Doing Non-Blocking I/O
//----------------------------------------------------------------------------------
// Groovy has no special support for this apart from making it easier to
// create threads (see note at end); it relies on Java's features here.
// InputStreams in Java/Groovy block if input is not yet available.
// This is not normally an issue, because if you have a potential blocking
// operation, e.g. save a large file, you normally just create a thread
// and save it in the background.
// Channels are one way to do non-blocking stream-based IO.
// Classes which extend the SelectableChannel abstract class provide
// a configureBlocking(boolean) method as well as an isBlocking() method.
// When processing a non-blocking stream, you need to process incoming
// information based on the number of bytes read returned by the various
// read methods. For non-blocking, this can be 0 bytes even if you pass
// a fixed size byte[] buffer to the read method. Non-blocking IO is typically
// used with network streams rather than with files, though files can be
// involved when Pipes (coupled sink and source channels) are used and
// one side of the pipe is a file.
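// A minimal sketch of the zero-byte case, using an in-process Pipe so that no
// network or file is needed (the variable names are illustrative):
def nbPipe = java.nio.channels.Pipe.open()
def nbSource = nbPipe.source()
nbSource.configureBlocking(false)
assert !nbSource.blocking
def nbBuffer = java.nio.ByteBuffer.allocate(8)
assert nbSource.read(nbBuffer) == 0 // nothing written yet: returns at once
nbPipe.sink().write(java.nio.ByteBuffer.wrap([1, 2, 3] as byte[]))
while (nbBuffer.position() < 3) {    // poll until all three bytes have arrived
    if (nbSource.read(nbBuffer) == 0) Thread.sleep(10)
}
assert nbBuffer.position() == 3
nbPipe.sink().close(); nbSource.close()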
//----------------------------------------------------------------------------------
Determining the Number of Bytes to Read
//----------------------------------------------------------------------------------
// Groovy uses Java's features here.
// For both blocking and non-blocking reads, the read operation returns the number
// of bytes read. In blocking operations, this normally corresponds to the number
// of bytes requested (typically the size of some buffer) but can have a smaller
// value at the end of a stream. Java also makes no guarantees about whether
// other streams in general will return bytes as they become available under
// certain circumstances (rather than blocking until the entire buffer is filled).
// In non-blocking operations, the number of bytes returned will typically be
// the number of bytes available (up to some maximum buffer or requested size).
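// A small sketch of the usual read loop. The 10-byte in-memory stream and the
// 4-byte buffer are arbitrary; note the short final read before -1 signals the end.
def sizeStream = new ByteArrayInputStream('0123456789'.getBytes('US-ASCII'))
def sizeBuffer = new byte[4]
def sizeCounts = []
def sizeRead
while ((sizeRead = sizeStream.read(sizeBuffer)) != -1) {
    sizeCounts << sizeRead // never assume the buffer was filled
}
assert sizeCounts == [4, 4, 2]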
//----------------------------------------------------------------------------------
Storing Filehandles in Variables
//----------------------------------------------------------------------------------
// This just works in Java and Groovy as per the previous examples.
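// For instance (a trivial sketch with invented names), handles can be kept in a
// map and passed to methods like any other object:
def writeTo(Writer handle, String message) { handle.write(message); handle }
def handleMap = [log: new StringWriter(), report: new StringWriter()]
handleMap.each{ label, handle -> writeTo(handle, 'written to ' + label) }
assert handleMap.log.toString() == 'written to log'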
//----------------------------------------------------------------------------------
Caching Open Output Filehandles
//----------------------------------------------------------------------------------
// Groovy uses Java's features here.
// More work has been done in the Java world on object caching than file caching
// with several open source and commercial offerings in that area. File caches
// are also available, for one, see:
// http://portals.apache.org/jetspeed-1/apidocs/org/apache/jetspeed/cache/FileCache.html
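// A minimal sketch of a least-recently-used cache of open writers. WriterCache
// and its members are hypothetical names, not a library API, and the limit of
// two open files is only for demonstration.
class WriterCache {
    int maxOpen = 2
    private Map writers = new LinkedHashMap(16, 0.75f, true) // true = access order
    Writer writerFor(File file) {
        Writer w = writers.get(file.path)
        if (w == null) {
            if (writers.size() >= maxOpen) {
                def eldest = writers.keySet().iterator().next() // least recently used
                writers.remove(eldest).close()
            }
            w = new FileWriter(file, true) // append, so reopening keeps earlier output
            writers.put(file.path, w)
        }
        return w
    }
    void closeAll() {
        writers.values().each{ it.close() }
        writers.clear()
    }
}
def cacheFiles = (1..3).collect{ File.createTempFile('pleac7cache' + it, '.txt') }
def writerCache = new WriterCache()
[0, 1, 2, 0].each{ idx -> writerCache.writerFor(cacheFiles[idx]).write('line for ' + idx + '\n') }
writerCache.closeAll()
def cacheFirst = cacheFiles[0].readLines()
def cacheThird = cacheFiles[2].readLines()
cacheFiles*.delete()
assert cacheFirst == ['line for 0', 'line for 0'] // reopened in append mode
assert cacheThird == ['line for 2']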
//----------------------------------------------------------------------------------
Printing to Many Filehandles Simultaneously
//----------------------------------------------------------------------------------
// The general pattern is: streams.each{ stream -> stream.println 'item to print' }
// See the MultiStream example in 13.5 for a coded example.
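// A minimal in-memory sketch of that pattern (StringWriters stand in for files):
def multiFirst = new StringWriter()
def multiSecond = new StringWriter()
def multiTargets = [new PrintWriter(multiFirst), new PrintWriter(multiSecond)]
multiTargets.each{ target -> target.println 'item to print' }
multiTargets*.flush()
assert multiFirst.toString() == multiSecond.toString()
assert multiFirst.toString().trim() == 'item to print'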
//----------------------------------------------------------------------------------
Opening and Closing File Descriptors by Number
//----------------------------------------------------------------------------------
// You wouldn't normally be dealing with FileDescriptors. In the case where you have
// one, you would normally walk through all known FileStreams asking each for
// its FileDescriptor until you found one that matched. You would then close
// that stream.
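// A short sketch of the descriptor-related calls that do exist. Java exposes no
// numeric descriptors; FileDescriptor.in, .out and .err cover the three standard
// ones, and a file stream hands out its own descriptor via getFD(). The temp
// file is a stand-in.
def fdOut = new PrintStream(new FileOutputStream(FileDescriptor.out), true)
fdOut.println 'written via the standard output descriptor'
def fdFile = File.createTempFile('pleac7fd', '.txt')
fdFile.deleteOnExit()
def fdStream = new FileInputStream(fdFile)
def fdHandle = fdStream.getFD()
assert fdHandle.valid()
fdStream.close()
assert !fdHandle.valid() // closing the stream invalidates its descriptor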
//----------------------------------------------------------------------------------
Copying Filehandles
//----------------------------------------------------------------------------------
// There are several concepts here. At the object level, any two object references
// can point to the same object. Any changes made by one of these will be visible
// in the 'alias'. You can also have multiple stream, reader, writer or channel objects
// referencing the same resource. Depending on the kind of resource, any potential
// locks, the operations being requested and the behaviour of third-party programs,
// the result of trying to perform such concurrent operations may not always be
// deterministic. There are strategies for coping with such scenarios but the
// best bet is to avoid the issue.
// For the scenario given, copying file handles, that corresponds most closely
// with cloning streams. The best bet is to just use individual stream objects
// both created from the same file. If you are attempting to do write operations,
// then you should consider using locks.
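// A sketch of that approach: two independent readers on the same file (a temp
// file here), each keeping its own position.
def dupFile = File.createTempFile('pleac7dup', '.txt')
dupFile.deleteOnExit()
dupFile.text = 'one\ntwo\n'
def dupFirst = dupFile.newReader()
def dupSecond = dupFile.newReader()
def dupLines = []
try {
    dupLines << dupFirst.readLine() << dupFirst.readLine()
    dupLines << dupSecond.readLine() // unaffected by the reads on dupFirst
} finally {
    dupFirst.close(); dupSecond.close()
}
assert dupLines == ['one', 'two', 'one']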
//----------------------------------------------------------------------------------
Program: netlock
//----------------------------------------------------------------------------------
// locking is built in to Java (since 1.4), so should not be missing
//----------------------------------------------------------------------------------
Program: lockarea
//----------------------------------------------------------------------------------
// Java locking supports locking just regions of files.
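// A minimal sketch of region locking on a 10-byte temp file (names and sizes are
// illustrative). Two non-overlapping regions can be locked at once, while an
// overlapping request from the same JVM raises OverlappingFileLockException.
def areaFile = File.createTempFile('pleac7area', '.dat')
areaFile.deleteOnExit()
areaFile.bytes = new byte[10]
def areaRaf = new RandomAccessFile(areaFile, 'rw')
def areaClash = false
try {
    def areaHead = areaRaf.channel.lock(0L, 5L, false)    // bytes 0..4
    def areaTail = areaRaf.channel.tryLock(5L, 5L, false) // bytes 5..9
    assert areaHead.valid && areaTail != null
    assert !areaHead.overlaps(5L, 5L)
    try {
        areaRaf.channel.tryLock(3L, 4L, false)            // bytes 3..6 overlap both
    } catch (java.nio.channels.OverlappingFileLockException ex) {
        areaClash = true
    }
    areaTail.release(); areaHead.release()
} finally {
    areaRaf.close()
}
assert areaClash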
//----------------------------------------------------------------------------------