Kejardon

Last modified by Kejardon on 2012/05/07 00:24

Current thoughts:
I think I want a MethodAdapter so that, when
private void compileLiteral(final MethodVisitor mv, final Object o)
in Compiler.java is reached, it instead emits something like
mv.visitVarInsn(ALOAD, index);
On finalizing, the MethodAdapter will have statistics for all its variables, create its own commonLiteralsList based on those statistics, and then do
getLiteralByIndex(mv, index);
mv.visitInsn(ICONST_0);
possibly by calling Compiler.java's code, or copy-pasting it internally and tweaking it.
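
A rough sketch of the statistics side of this, assuming the adapter just counts how often each literal index is requested and then orders them by use. LiteralStats and recordUse are made-up names, not anything in Compiler.java, and the tie-breaking rule is my own guess.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: counts how often each literal index is requested
// and orders the indices so the most-used ones come first.
class LiteralStats {
    private final Map<Integer, Integer> uses = new HashMap<>();

    // Called wherever the adapter sees the placeholder ALOAD for a literal.
    void recordUse(int literalIndex) {
        uses.merge(literalIndex, 1, Integer::sum);
    }

    // Most frequently used literal first; ties broken by lower index (assumption).
    List<Integer> commonLiteralsList() {
        List<Integer> order = new ArrayList<>(uses.keySet());
        order.sort((a, b) -> {
            int byCount = Integer.compare(uses.get(b), uses.get(a));
            return byCount != 0 ? byCount : Integer.compare(a, b);
        });
        return order;
    }
}
```

The idea would be that the front of this list gets the cheapest access path when the adapter finalizes.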

I might have enough plans now to work out a finalish draft for this MethodAdapter stuff. Let's see.
Compiler.java works mostly the same, creating run(int) with a simple loop and nothing else, tableswitch(int) with a table of all 5712 labels and nothing else, each label calling helper.mX(), X being a number from 1 to 5712. The compiler marks all labels as needing to be a submethod(? This part can probably be cut out, there is likely a more convenient/already generated list suitable for this).
Each helper.mX() is made using the current MethodAdapter being planned. compileLiteral(mv, o) does the simplified thing listed above, just mv.visitVarInsn(ALOAD, index). Other than that, I think Compiler.java would work the same, all other conversions done within the MethodAdapter itself.
TODO: I am working on a sort of assumption that there will be no other ALOAD calls (aside from maybe ALOAD_0 which would be left alone). I need to verify that or change it to an unused opcode.
TODO: Look over Compiler.java, find and plan out any other instances of commands needing to be delayed.
As the MethodAdapter is sent opcodes, it should keep track of maxBinarySize for each opcode and a partial stackmap to plan further submethods around. Scroll down for this logic in detail.
Conversions in MethodAdapter can wait until it's finalized, I think. Not a big issue either way probably, but doing it later will probably lower the size of its internal buffer.
TODO: Look over all opcodes, plan out all conversions.
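
For the maxBinarySize part, a minimal sketch of what the worst-case numbers could look like for two opcode families, going by the JVM instruction encodings as I understand them. MaxSize and its member names are hypothetical, and this only covers local variable loads/stores and jumps.

```java
// Hypothetical size table for two opcode families; everything else is left out.
class MaxSize {
    // goto is 3 bytes, goto_w is 5; assume the wide form until the offset is known.
    static final int MAX_GOTO = 5;
    // A conditional jump is 3 bytes; if the target is too far it has to become
    // an inverted conditional (3 bytes) jumping over a goto_w (5 bytes).
    static final int MAX_IF = 8;
    // Worst case for a load/store whose final index is not known yet.
    static final int MAX_VAR_INSN = 4;

    // Exact size of xLOAD/xSTORE once the local variable index is final.
    static int varInsn(int index) {
        if (index <= 3) return 1;    // short forms such as aload_0..aload_3
        if (index <= 255) return 2;  // opcode + one-byte index
        return 4;                    // wide prefix + opcode + two-byte index
    }
}
```

Since the literal ALOADs get their real index only when finalizing, they would have to be counted as MAX_VAR_INSN until then.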

Detailed logic for the partial stackmap and planning out submethods:
Global is used here as in a variable within the MethodAdapter, not within Marks or other objects.
Keep an ArrayList of opcodes (combine their code and argument)
Keep an ArrayList of potentialSubMethods or something similar. PotentialSM will have .location, .endLocation
Keep an ArrayList of Marks. Marks will have .type (int), .stackSize (int), .location (int) (measured in opcodes passed to the MethodAdapter), .maxSizeSinceLastMark (int), .returnSinceLastMark (boolean).
Keep global values for stackSize, maxSizeSinceLastMark, returnSinceLastMark, location(? really just opcodes.size())
Each time an opcode is passed, (pre)calculate the most space it will take and add this to a global maxSizeSinceLastMark. Also add its effect to the stackSize.
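
The bookkeeping above could be laid out roughly like this. The field names are the ones from these notes; the type constants, the MarkTracker class and its account method are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// One Mark, with the fields listed in the notes. The constant values are arbitrary.
class Mark {
    static final int START = 0, ENTRANCE = 1, BRANCH = 2, LABEL = 3;
    final int type, stackSize, location, maxSizeSinceLastMark;
    final boolean returnSinceLastMark;

    Mark(int type, int stackSize, int location,
         int maxSizeSinceLastMark, boolean returnSinceLastMark) {
        this.type = type;
        this.stackSize = stackSize;
        this.location = location;
        this.maxSizeSinceLastMark = maxSizeSinceLastMark;
        this.returnSinceLastMark = returnSinceLastMark;
    }
}

// Hypothetical holder for the "global" values kept inside the MethodAdapter.
class MarkTracker {
    final List<Mark> marks = new ArrayList<>();
    int stackSize, maxSizeSinceLastMark, location;
    boolean returnSinceLastMark;

    // Account for one opcode: worst-case encoded size and net stack effect.
    void account(int maxBytes, int stackDelta) {
        maxSizeSinceLastMark += maxBytes;
        stackSize += stackDelta;
        location++; // stands in for opcodes.size()
    }
}
```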

Mark the start of the method (START, 0, 0, 0, false).
Each time the stack will not decrease below its current point from an opcode, add a mark before that opcode as a possible method entrance (.type==ENTRANCE). Store maxSizeSinceLastMark to the new Mark, clear the global version, then add this opcode's bytesize and stacksize to the global variables. The first Mark will end up being (ENTRANCE, 0, 0, 0, false).
Each time the stack will decrease (at all, not necessarily net) from an opcode, remove ENTRANCE Marks from the end of the ArrayList<Mark> until one is reached that has a stack size less than or equal to the minimum stack from the opcode. Each time a Mark is removed, add its maxSizeSinceLastMark to the global version and OR (|=) its returnSinceLastMark into the global one.
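
A minimal sketch of the two ENTRANCE rules above, assuming each opcode is described by how many stack slots it pops and pushes plus its worst-case byte size. EntranceTracker and visit are hypothetical names. It also assumes an ENTRANCE Mark clears returnSinceLastMark along with maxSizeSinceLastMark, which the notes only imply.

```java
import java.util.ArrayList;
import java.util.List;

class EntranceTracker {
    static final int START = 0, ENTRANCE = 1;

    static final class Mark {
        final int type, stackSize, location, maxSizeSinceLastMark;
        final boolean returnSinceLastMark;
        Mark(int type, int stackSize, int location, int maxSize, boolean ret) {
            this.type = type;
            this.stackSize = stackSize;
            this.location = location;
            this.maxSizeSinceLastMark = maxSize;
            this.returnSinceLastMark = ret;
        }
    }

    final List<Mark> marks = new ArrayList<>();
    int stackSize, maxSizeSinceLastMark, location;
    boolean returnSinceLastMark;

    EntranceTracker() {
        marks.add(new Mark(START, 0, 0, 0, false));
    }

    void visit(int pops, int pushes, int maxBytes) {
        if (pops == 0) {
            // Stack never dips below its current level: possible method entrance.
            marks.add(new Mark(ENTRANCE, stackSize, location,
                               maxSizeSinceLastMark, returnSinceLastMark));
            maxSizeSinceLastMark = 0;
            returnSinceLastMark = false; // assumption, see lead-in
        } else {
            int minStack = stackSize - pops;
            // Fold trailing ENTRANCE Marks that sit above the minimum stack.
            while (!marks.isEmpty()) {
                Mark last = marks.get(marks.size() - 1);
                if (last.type != ENTRANCE || last.stackSize <= minStack) break;
                marks.remove(marks.size() - 1);
                maxSizeSinceLastMark += last.maxSizeSinceLastMark;
                returnSinceLastMark |= last.returnSinceLastMark;
            }
        }
        // Only now add this opcode's own size and stack effect.
        maxSizeSinceLastMark += maxBytes;
        stackSize += pushes - pops;
        location++;
    }
}
```

Feeding it two pushes followed by an opcode that pops both should leave just the START Mark and the first ENTRANCE Mark, with the byte sizes of all three opcodes folded into the global maxSizeSinceLastMark.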
If the opcode is something that will trigger a return (goto/if to an outside label, an actual return) set returnSinceLastMark. If any BRANCH Marks end up with no ENTRANCE Marks between them and their LABEL Mark, re
If the opcode branches (goto/if to a label within this code), add a Mark here (BRANCH, (Label.info instanceof Integer)?(Label.info.intValue()):((Label.info=Integer.valueOf(labelCount++)).intValue()), location, maxSizeSinceLastMark, returnSinceLastMark)
BRANCH Marks will NOT clear maxSizeSinceLastMark or returnSinceLastMark.
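
The label numbering inside that BRANCH Mark could be pulled out into something like the following. LabelStub is a stand-in so the sketch compiles on its own; I believe ASM's Label has a public Object field called info that adapters are free to use, but that should be checked against the ASM version in use. LabelNumbering and indexOf are hypothetical names.

```java
// Stand-in for ASM's Label; only the info field matters here.
class LabelStub {
    Object info;
}

class LabelNumbering {
    private int labelCount = 0;

    // Return the label's index, assigning the next free one the first time it is seen.
    int indexOf(LabelStub label) {
        if (!(label.info instanceof Integer)) {
            label.info = Integer.valueOf(labelCount++);
        }
        return (Integer) label.info;
    }
}
```

This way a BRANCH Mark and the later LABEL Mark for the same Label end up carrying the same number regardless of which one is visited first.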
After a new global stackSize is calculated for the opcode, look back from the end of the Mark list. Stop backtracking when a Mark is found with a stack size at least 2 less than the global stackSize, or when a BRANCH or LABEL Mark is found. Go forward again and check the next stack size. If it is 1 less than or equal to the global stackSize, this is a potential sub sub method. Remove all Marks after that one, calculating the total size for maxSizeSinceLastMark.
If the total is too large (let's say > 64000 for now), tell the last PotentialSM it is now an actual submethod and recalculate the size (remove the size of the opcodes now in the submethod, add the size of the overhead). TODO: calculate the overhead size later, both with return and without return. Also maybe consider a third option for when it will ALWAYS return.
Otherwise, if the Mark fits (maxSize < 64000), make a new PotentialSM (.location=Mark.location, .endLocation=location/opcodes.size()), remove all PotentialSMs with a .location at or after the new PotentialSM's .location, then add the new PotentialSM to the ArrayList<PotentialSM>.
IN PROGRESS. There are a few holes I know about already (e.g. sub sub methods' size will not be exactly maxSizeSinceLastMark), will finish later.
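
Just the PotentialSM list maintenance from the step above, as a sketch. It ignores the backtracking over Marks and the promotion to an actual submethod. PotentialSMList and offer are hypothetical names, and 64000 is only the provisional limit mentioned above (what happens at exactly 64000 is not decided; here it counts as too large).

```java
import java.util.ArrayList;
import java.util.List;

class PotentialSM {
    final int location, endLocation;

    PotentialSM(int location, int endLocation) {
        this.location = location;
        this.endLocation = endLocation;
    }
}

class PotentialSMList {
    static final int MAX_SIZE = 64000; // provisional limit from the notes

    final List<PotentialSM> candidates = new ArrayList<>();

    // Returns true if the candidate fit and was recorded.
    boolean offer(int markLocation, int endLocation, int maxSize) {
        if (maxSize >= MAX_SIZE) {
            return false; // too large; the caller would promote the last PotentialSM
        }
        // A new candidate supersedes every candidate starting at or after it.
        candidates.removeIf(sm -> sm.location >= markLocation);
        candidates.add(new PotentialSM(markLocation, endLocation));
        return true;
    }
}
```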

When visitLabel(Label), add a new Mark

Current plans:
Working in Compiler.java. Want to recode run() to something similar to

public static FlagObject O=new FlagObject();
run(int i)
{
  O.target=i;
  while(O.continueTable)
    tableswitch(O.target);
  return O.aReturn;
}
tableswitch(int target)
{
  switch(target)
  {
  case 0:
  helper.m1();
  return;
  case 1:
  helper.m2();
  return;
  ...
  case 5711:
  helper.m5712();
  return;
  }
}
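
A compilable miniature of the same loop with only two labels, to check the control flow. The FlagObject fields are the ones used above; m1 and m2 are stand-ins for helper.mX(), and resetting continueTable at the top of run is my own addition so that run can be called more than once.

```java
class FlagObject {
    int target;
    boolean continueTable = true;
    Object aReturn;
}

class DispatchSketch {
    static final FlagObject O = new FlagObject();

    static Object run(int i) {
        O.target = i;
        O.continueTable = true; // assumption: reset per call
        while (O.continueTable) {
            tableswitch(O.target);
        }
        return O.aReturn;
    }

    static void tableswitch(int target) {
        switch (target) {
            case 0:
                m1();
                return;
            case 1:
                m2();
                return;
            default:
                throw new IllegalStateException("no such label: " + target);
        }
    }

    // Stand-in for a submethod that ends with a goto to label 1.
    static void m1() {
        O.target = 1;
    }

    // Stand-in for a submethod that ends with an areturn.
    static void m2() {
        O.aReturn = "done";
        O.continueTable = false;
    }
}
```

Calling run(0) goes through m1, loops back into the table, then m2 stops the loop and supplies the return value.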

Also need to figure out an intermediate form of the code: compileStatements to SomeThing (in a MethodAdapter) to Java bytecode. SomeThing needs to keep track of variable requests and optimize them to keep more commonly used variables in easily accessed locations. Maybe keep particularly common things used in all the code in fields within the helper class... this is probably best done BEFORE any individual methods. jdstroy has more knowledge in this area and can probably handle it better than me.
Form of SomeThing. I don't want to change Compiler's conversion logic a whole lot at the moment, so the MethodAdapter will probably still take in the same opcodes from Compiler. Buffer ALL of it in either an array or a list. When it is done at visitEnd, the MethodAdapter will finalize then flush its data to its MethodVisitor.
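
The buffer-then-flush shape, sketched without ASM so it stands alone. InsnSink is a made-up one-method stand-in for the downstream MethodVisitor, and each buffered entry is just an opcode plus a single int argument, which is a simplification.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal stand-in for the downstream MethodVisitor (hypothetical interface).
interface InsnSink {
    void insn(int opcode, int operand);
}

class BufferingAdapter implements InsnSink {
    private final InsnSink next;
    private final List<int[]> buffer = new ArrayList<>();

    BufferingAdapter(InsnSink next) {
        this.next = next;
    }

    // Nothing is forwarded while the method body is being visited.
    @Override
    public void insn(int opcode, int operand) {
        buffer.add(new int[] {opcode, operand});
    }

    // Finalizing (conversions, splitting into submethods) would happen here,
    // then the buffered opcodes are flushed downstream in their original order.
    void visitEnd() {
        for (int[] entry : buffer) {
            next.insn(entry[0], entry[1]);
        }
        buffer.clear();
    }
}
```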
TODO: Come up with a list of opcodes that need to be reworked when finalizing (i.e. goto may become O.target=goto.target; return; - areturn become putfield O.aReturn; putfield O.continueTable; return)
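
To make the two example conversions concrete, here is a purely symbolic sketch. Instructions are plain strings, not real bytecode, the output names simply mirror the wording of the TODO, and ExitRewriter is a hypothetical name. It also treats every goto as leaving the submethod, which will not be true for branches to labels inside the same code.

```java
import java.util.ArrayList;
import java.util.List;

class ExitRewriter {
    // Instructions are symbolic strings here, e.g. "goto 12" or "areturn".
    static List<String> rewrite(List<String> in) {
        List<String> out = new ArrayList<>();
        for (String insn : in) {
            if (insn.startsWith("goto ")) {
                // Record the target for the dispatch loop, then leave the submethod.
                out.add("O.target=" + insn.substring(5));
                out.add("return");
            } else if (insn.equals("areturn")) {
                // Store the result and stop the dispatch loop, then leave.
                out.add("putfield O.aReturn");
                out.add("putfield O.continueTable");
                out.add("return");
            } else {
                out.add(insn);
            }
        }
        return out;
    }
}
```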
Finalizing also has to keep track of size. If too large, create a stackmap and figure out optimal locations to break the submethod apart into, say, subX(), X being a static counter for the helper class. The logic for this part is already in my personal notes.
The way I've said all this so far, subX() methods will also need their own opcode conversion scheme. Also, considering at what point the subX() methods are created, they will probably just use the same local variable optimizations as their parent mX() method. This is not entirely efficient though.

Other todo:
Detection for unreachable statements within the .ax code - skip these statements!
