Translate 6809 to C code

apraman

2006-11-03 06:21:10 UTC

I need a converter/translator that can translate my 6809 assembly to C
code. I tried with disassemblies, but they were of no use. Any help in
this regard would be appreciated. An affordable commercial tool, if one
is already available, would also be fine.

Thanks
Ram

Mark McDougall

2006-11-03 07:27:50 UTC

Post by apraman
I need a converter/translator that can translate my 6809 assembly to C
code. I tried with disassemblies, but they were of no use. Any help in
this regard would be appreciated. An affordable commercial tool, if one
is already available, would also be fine.

What you ask is all but impossible. It's difficult enough to recover
functionality by manually disassembling code, let alone writing a
program to do it automatically.

If this is a commercial product then your only option is to pay someone
to do it for you.

Regards,

--
Mark McDougall, Engineer
Virtual Logic Pty Ltd, <http://www.vl.com.au>
21-25 King St, Rockdale, 2216
Ph: +612-9599-3255 Fax: +612-9599-3266

Chris Croughton

2006-11-03 12:40:50 UTC

On Fri, 03 Nov 2006 18:27:50 +1100, Mark McDougall

Post by Mark McDougall

What you ask is all but impossible. It's difficult enough to recover
functionality by manually disassembling code, let alone writing a
program to do it automatically.

Oh, I dunno. As long as the code isn't doing silly things like being
self-modifying it's not too hard to write a disassembler which
translates each instruction into C code which emulates it. Of course,
it will be massive and totally unreadable, but it wasn't specified that
it had to be either small or readable. For that matter an answer would
be to have an emulator in C which just contained the original program as
its image.
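
A minimal sketch of that instruction-by-instruction approach, assuming a hypothetical three-instruction fragment (LDA #$05 / ADDA #$03 / STA $2000) and an illustrative `cpu6809` struct. None of these names come from the thread, and a real translator would need every opcode and addressing mode. Each 6809 instruction turns into a few lines of C that update emulated registers, flags and memory:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical emulated state; all names here are illustrative. */
typedef struct {
    uint8_t a;   /* accumulator A */
    uint8_t cc;  /* condition codes: E F H I N Z V C (bit 7..0) */
} cpu6809;

enum { CC_C = 0x01, CC_V = 0x02, CC_Z = 0x04, CC_N = 0x08, CC_H = 0x20 };

static uint8_t mem[0x10000];   /* flat 64K address space */

/* Store a byte; the assert guards against out-of-range illustrative addresses. */
static void write8(unsigned addr, uint8_t v) {
    assert(addr < sizeof mem);
    mem[addr] = v;
}

/* Flag update shared by 8-bit loads and stores: set N and Z, clear V. */
static void flags_ld8(cpu6809 *c, uint8_t v) {
    c->cc = (uint8_t)(c->cc & ~(CC_N | CC_Z | CC_V));
    if (v & 0x80) c->cc |= CC_N;
    if (v == 0)   c->cc |= CC_Z;
}

/* 8-bit add that updates the H, N, Z, V and C flags. */
static uint8_t add8(cpu6809 *c, uint8_t lhs, uint8_t rhs) {
    unsigned sum = (unsigned)lhs + rhs;
    uint8_t r = (uint8_t)sum;
    c->cc = (uint8_t)(c->cc & ~(CC_H | CC_N | CC_Z | CC_V | CC_C));
    if (sum & 0x100)                  c->cc |= CC_C;
    if ((lhs ^ rhs ^ r) & 0x10)       c->cc |= CC_H;
    if ((lhs ^ r) & (rhs ^ r) & 0x80) c->cc |= CC_V;
    if (r & 0x80)                     c->cc |= CC_N;
    if (r == 0)                       c->cc |= CC_Z;
    return r;
}

/* Mechanical translation of the hypothetical fragment. */
void translated_fragment(cpu6809 *c) {
    c->a = 0x05;  flags_ld8(c, c->a);           /* LDA  #$05  */
    c->a = add8(c, c->a, 0x03);                 /* ADDA #$03  */
    write8(0x2000, c->a);  flags_ld8(c, c->a);  /* STA  $2000 */
}

/* Runs the fragment on a zeroed CPU and returns the byte stored at $2000. */
uint8_t demo_fragment(void) {
    cpu6809 c = {0, 0};
    translated_fragment(&c);
    return mem[0x2000];
}
```

Three instructions already need around forty lines of support code, which gives a feel for how large and unreadable a whole program would become.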

Post by Mark McDougall
If this is a commercial product then your only option is to pay someone
to do it for you.

Pay them lots. First they'd have to disassemble it, then work out what
it was supposed to be doing. Cheaper to rewrite it from scratch. My
rates start at 35 pounds per hour (plus VAT, England expects every man
to pay her duty)...

Chris C

Rainer Buchty

2006-11-03 13:32:06 UTC

In article <***@ccserver.keris.net>,
Chris Croughton <***@keristor.net> writes:
|> Oh, I dunno. As long as the code isn't doing silly things like being
|> self-modifying it's not too hard to write a disassembler which
|> translates each instruction into C code which emulates it.

I'd like to see a translation tool handle this:

    JSR  h84AE
    FDB  $0B6D          ; inline parameter word placed right after the call

Where within the routine (among other things) this happens:

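    PSHS U,X,B,A,CC     ; saves 7 bytes, so the return address is now at 7,S
    LDX  +$07,S         ; X = return address, i.e. the address of the FDB word
    LDU  ,X++           ; U = inline parameter; X now points past it
    STX  +$07,S         ; patch the return address so RTS resumes after the data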

(Found in the Ensoniq ESQ1 and SQ80 OS's task handler...)

Not self-modifying, but definitely requires more than 1:1 translation to
finally reach a "void call_84ae(0x0b6d)".
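
To make the stack manipulation concrete, here is a rough C model of those four instructions. It is only a sketch under assumptions: the `cpu6809` struct, the helper names, the call site at $9000 and the stack top at $8000 are all invented for illustration, and condition-code effects are left out.

```c
#include <assert.h>
#include <stdint.h>

static uint8_t mem[0x10000];   /* emulated 64K address space */

typedef struct { uint8_t a, b, cc; uint16_t x, u, s; } cpu6809;  /* illustrative */

static uint16_t read16(uint16_t addr) {          /* the 6809 is big-endian */
    return (uint16_t)((mem[addr] << 8) | mem[(uint16_t)(addr + 1)]);
}
static void write16(uint16_t addr, uint16_t v) {
    mem[addr] = (uint8_t)(v >> 8);
    mem[(uint16_t)(addr + 1)] = (uint8_t)v;
}
static void push8(cpu6809 *c, uint8_t v) {
    assert(c->s > 0);                            /* no stack wrap-around in this sketch */
    c->s--;
    mem[c->s] = v;
}
static void push16(cpu6809 *c, uint16_t v) {     /* low byte first, high byte ends up lower */
    push8(c, (uint8_t)v);
    push8(c, (uint8_t)(v >> 8));
}

/* Model of the four instructions at the start of the routine. */
void fetch_inline_param(cpu6809 *c) {
    push16(c, c->u);  push16(c, c->x);           /* PSHS U,X,B,A,CC: 7 bytes, */
    push8(c, c->b);   push8(c, c->a);            /* CC at the lowest address  */
    push8(c, c->cc);
    c->x = read16((uint16_t)(c->s + 7));         /* LDX 7,S : return address  */
    c->u = read16(c->x);                         /* LDU ,X++: inline word     */
    c->x = (uint16_t)(c->x + 2);
    write16((uint16_t)(c->s + 7), c->x);         /* STX 7,S : skip the data   */
}

/* Hypothetical call site: JSR at $9000, so the FDB word sits at $9003. */
static void setup_call(cpu6809 *c) {
    c->s = 0x8000;                               /* assumed stack top */
    write16(0x9003, 0x0B6D);                     /* FDB $0B6D */
    push16(c, 0x9003);                           /* what JSR pushes */
    fetch_inline_param(c);
}

uint16_t demo_param(void) {                      /* value that lands in U */
    cpu6809 c = {0};
    setup_call(&c);
    return c.u;
}

uint16_t demo_return(void) {                     /* patched return address */
    cpu6809 c = {0};
    setup_call(&c);
    return read16((uint16_t)(c.s + 7));
}
```

An emulating translation like this reproduces the behaviour, but recognising that the word after the JSR is really an argument still takes a human or a purpose-built analysis.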

Rainer

Chris Croughton

2006-11-05 13:20:19 UTC

On Fri, 3 Nov 2006 13:32:06 +0000 (UTC), Rainer Buchty

Post by Rainer Buchty
|> Oh, I dunno. As long as the code isn't doing silly things like being
|> self-modifying it's not too hard to write a disassembler which
|> translates each instruction into C code which emulates it.

    JSR h84AE
    fdb $0B6D

    PSHS U,X,B,A,CC
    LDX +$07,S
    LDU ,X++
    STX +$07,S

(Found in the Ensoniq ESQ1 and SQ80 OS's task handler...)
Not self-modifying, but definitely requires more than 1:1 translation to
finally reach a "void call_84ae(0x0b6d)".

Well, I said "silly things like", computed jumps and calls and playing
around with the stack among them. Although even then if the destination
is valid you could do it with a big enough switch statement. Basically,
if a C compiler generated the code then you stand a chance, if it was an
assembler programmer being 'clever' you don't. Of course, whether
writing a program to do such translation would be at all useful except
as an exercise in doing weird things is debatable.
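
The "big enough switch statement" idea might look roughly like this. The block addresses ($8000, $8010, $8020), the `JMP ,X` at the end of the first block and the `regs` struct are all hypothetical. Every basic block becomes a case, and a computed jump simply assigns the emulated PC and dispatches again:

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint8_t a; uint16_t x; } regs;   /* illustrative register subset */

/* Returns 0 when the translated code returns normally, or -1 when a computed
   jump lands on an address that was never identified as a block start. */
int run_translated(regs *r, uint16_t pc) {
    assert(r);
    for (;;) {
        switch (pc) {
        case 0x8000:        /* ...straight-line code, then JMP ,X */
            pc = r->x;      /* destination is only known at run time */
            break;          /* leave the switch and dispatch again */
        case 0x8010:        /* LDA #$01 ; RTS */
            r->a = 0x01;
            return 0;
        case 0x8020:        /* LDA #$02 ; RTS */
            r->a = 0x02;
            return 0;
        default:            /* jump into the middle of nowhere */
            return -1;
        }
    }
}

/* Starts at the hypothetical entry block with X preloaded; returns A or -1. */
int demo_dispatch(uint16_t target) {
    regs r = {0, target};
    int rc = run_translated(&r, 0x8000);
    return rc == 0 ? r.a : rc;
}
```

A jump to an address that is not a case label falls into `default`, which is where the "if the destination is valid" caveat bites. A jump back to $8000 would loop forever, just as the original code would.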

Chris C

Rainer Buchty

2006-11-03 13:15:19 UTC

In article <454aef80$0$8052$***@per-qv1-newsreader-01.iinet.net.au>,
Mark McDougall <***@vl.com.au> writes:
|> What you ask is all but impossible.

Not really. But what comes out of such an automatic conversion wouldn't
be any more readable than assembly language.

Most likely he's been misled by gdb output from binaries compiled with the -g option.

Rainer

Dick Georgeson

2006-11-04 00:42:27 UTC

We have evidence that on Fri, 03 Nov 2006 07:27:50 +0000, Mark McDougall

Post by Mark McDougall

What you ask is all but impossible. It's difficult enough to recover
functionality by manually disassembling code, let alone writing a
program to do it automatically.
If this is a commercial product then your only option is to pay someone
to do it for you.

First find your masochist!

The problem is that compiling to assembler is a lossy process. Obviously
you lose all the variable and function names, which usually give clues to
what the variable/function does, but you're also asking to reconstruct
(e.g.) for/while loops, which may have the conditional test at the
beginning or the end depending on the compiler, or switch statements,
which, if there are a lot of cases, come out as jump tables.
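
As a small illustration of the loop point, here are two C functions with illustrative names. One has the test at the top as a programmer would write it; the other has the shape a compiler may produce, with one guard followed by a test at the bottom. They behave identically, so the machine code alone does not say which form the source used:

```c
#include <assert.h>
#include <stddef.h>

/* Test at the top, as it might appear in the source. */
int sum_top_test(const int *v, size_t n) {
    int total = 0;
    size_t i = 0;
    while (i < n) {
        total += v[i];
        i++;
    }
    return total;
}

/* Same loop with one guard up front and the test moved to the bottom. */
int sum_bottom_test(const int *v, size_t n) {
    int total = 0;
    size_t i = 0;
    if (i < n) {
        do {
            total += v[i];
            i++;
        } while (i < n);
    }
    return total;
}

/* Returns the common sum for a small sample array, or -1 on disagreement. */
int demo_loops_agree(void) {
    static const int data[] = {3, 1, 4, 1, 5};
    size_t n = sizeof data / sizeof data[0];
    assert(sum_top_test(data, 0) == sum_bottom_test(data, 0));  /* empty case */
    return sum_top_test(data, n) == sum_bottom_test(data, n)
               ? sum_top_test(data, n)
               : -1;
}
```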

I used to use gcc (with a non-GNU assembler) and debug using assembler
listings with C code interspersed; often it was quite difficult to relate
the assembler to the C even then. Apparently innocent C source lines could
generate a raft of assembler or practically nothing, the degree of either
depending on how much optimisation it had been compiled with.

--
Dick Georgeson
Whenever you find that you are on the side of the majority, it is time
to reform. -- Mark Twain

Rainer Buchty

2006-11-04 00:57:07 UTC

In article <***@nospam.zetnet.co.uk>,
Dick Georgeson <***@nospam.zetnet.co.uk> writes:
|> First find your masochist!

BTDT. Can be fun.

If you like solving a 10,000-piece jigsaw puzzle.

With no picture printed on it. And all shapes looking almost the same.

|> I used to use gcc (with a non-GNU assembler) and debug using assembler
|> listings with C code interspersed; often it was quite difficult to relate
|> the assembler to the C even then. Apparently innocent C source lines could
|> generate a raft of assembler or practically nothing, the degree of either
|> depending on how much optimisation it had been compiled with.

Agreed, trying to grok compiler-generated code can be a really hard task.

I remember a discussion on the avr-gcc mailing list where the compiler
spit out something which at first glance was suspected to be buggy but then
turned out to be a really clever optimization -- one which no human would
ever have used, because it was just too far from how a person thinks.

Converting human-generated code, by contrast, is comparatively easy, because
whatever optimization was done is most likely still comprehensible,
whereas modern optimized compiler-generated code is sometimes beyond
recognition and takes rather hard work to decipher.

Rainer

Chris Croughton

2006-11-05 13:33:09 UTC

On Sat, 4 Nov 2006 00:57:07 +0000 (UTC), Rainer Buchty

Post by Rainer Buchty
Agreed, trying to grok compiler-generated code can be a really hard task.

Trying to work out why the compiler did it (without access to the
compiler source code) is fun as well. For some value of 'fun'.

Post by Rainer Buchty
I remember a discussion on the avr-gcc mailing list where the compiler
spit out something which at first glance was suspected to be buggy but then
turned out to be a really clever optimization -- one which no human would
ever have used, because it was just too far from how a person thinks.

Optimisers frequently do things like eliminating whole chunks of code
which they 'know' can't be executed. They don't always get it right, and it
can be most disconcerting to look at the assembler (or core dump) and
see half the function missing!

Post by Rainer Buchty
Converting human-generated code, by contrast, is comparatively easy, because
whatever optimization was done is most likely still comprehensible,
whereas modern optimized compiler-generated code is sometimes beyond
recognition and takes rather hard work to decipher.

Hand-optimised assembler is often worse. At least compilers have to
obey some common constraints (consistent order of parameters on stack or
in registers, often consistent stack frames, etc.), whereas assembler
programmers are free to do anything they want, including using
instructions as data or vice versa. Or running code on the stack.

Changing processors, there was a fairly common piece of PDP-11 code
which went:

ENTRY0: MOV *(PC++), *(PC++)
ENTRY1: CLR *(PC++)
FLAG: DW 0
...

(syntax modified). If called as ENTRY0, the flag was non-zero; if
called as ENTRY1, it was zero. Easy for a human to recognise, impossible
to translate back into a high-level language automatically at all
sensibly (few HLLs have multiple entry points). And it didn't actually
modify the code (code space, yes, but there was no distinction between
code and data spaces).
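
For comparison, a person who recognises the idiom would probably write something like the following in C. This is only a guess at a hand translation: the shared body is hypothetical, since the code after FLAG is not shown, and the return values are placeholders.

```c
#include <assert.h>

static int flag;   /* plays the role of FLAG */

/* Hypothetical shared body; whatever followed FLAG in the original goes here. */
static int common_body(void) {
    assert(flag == 0 || flag == 1);
    return flag ? 100 : 200;      /* placeholder behaviour */
}

/* ENTRY0: arrive with the flag non-zero. */
int entry0(void) {
    flag = 1;
    return common_body();
}

/* ENTRY1: arrive with the flag zero. */
int entry1(void) {
    flag = 0;
    return common_body();
}
```

Two thin wrappers stand in for the two entry points. Getting there means understanding the intent of the idiom, which is the step an automatic translator lacks.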

Chris C

Everett M. Greene

2006-11-04 16:54:56 UTC

Post by Dick Georgeson

Post by Mark McDougall

What you ask is all but impossible. It's difficult enough to recover
functionality by manually disassembling code, let alone writing a
program to do it automatically.
If this is a commercial product then your only option is to pay someone
to do it for you.

First find your masochist!
The problem is that compiling to assembler is a lossy process. Obviously
you lose all the variable and function names, which usually give clues to
what the variable/function does, but you're also asking to reconstruct
(e.g.) for/while loops, which may have the conditional test at the
beginning or the end depending on the compiler, or switch statements,
which, if there are a lot of cases, come out as jump tables.
I used to use gcc (with a non-GNU assembler) and debug using assembler
listings with C code interspersed; often it was quite difficult to relate
the assembler to the C even then. Apparently innocent C source lines could
generate a raft of assembler or practically nothing, the degree of either
depending on how much optimisation it had been compiled with.

All the above is true, but you overstate the case. My
experience has been that extremely tricky code sequences
are extremely rare. More often what will be found is
very poor code sequences produced by compilers.

A good disassembler will go a long way toward solving
the translation to a high-level language (especially
back to C). It's possible for a disassembler to
distinguish between instructions and data and thus
not produce gibberish by trying to interpret text and
other data as instructions.
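
One common way to separate code from data is to follow control flow from a known entry point instead of decoding bytes linearly. The sketch below assumes a toy image loaded at address 0 and knows only seven opcodes (NOP, BRA, BEQ, RTS, JMP extended, LDA immediate, JSR extended). The function names are illustrative, and it is not how any particular disassembler works.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { MAX_IMG = 256 };
enum { OP_NOP = 0x12, OP_BRA = 0x20, OP_BEQ = 0x27, OP_RTS = 0x39,
       OP_JMP_EXT = 0x7E, OP_LDA_IMM = 0x86, OP_JSR_EXT = 0xBD };

/* Instruction length for the few opcodes this sketch knows; 0 = unknown. */
static size_t op_length(uint8_t op) {
    switch (op) {
    case OP_NOP: case OP_RTS:                  return 1;
    case OP_BRA: case OP_BEQ: case OP_LDA_IMM: return 2;
    case OP_JMP_EXT: case OP_JSR_EXT:          return 3;
    default:                                   return 0;
    }
}

/* Follows control flow from `entry` and sets is_code[i] = 1 for every byte of
   a reachable instruction. Returns 0 on success, -1 on an unknown opcode or a
   target outside the image. Assumes every JSR returns normally. */
int trace_code(const uint8_t *img, size_t size, size_t entry, uint8_t *is_code) {
    size_t work[MAX_IMG + 1];   /* pending addresses, at most one per instruction */
    size_t n = 0;
    assert(size <= MAX_IMG && entry < size);
    memset(is_code, 0, size);
    work[n++] = entry;
    while (n > 0) {
        size_t pc = work[--n];
        while (!is_code[pc]) {
            uint8_t op = img[pc];
            size_t len = op_length(op);
            int has_target = 0;
            long target = 0;
            if (len == 0 || pc + len > size) return -1;
            memset(is_code + pc, 1, len);
            if (op == OP_BRA || op == OP_BEQ) {
                int off = img[pc + 1];
                if (off >= 0x80) off -= 0x100;        /* sign-extend the offset */
                target = (long)(pc + len) + off;
                has_target = 1;
            } else if (op == OP_JMP_EXT || op == OP_JSR_EXT) {
                target = (img[pc + 1] << 8) | img[pc + 2];
                has_target = 1;
            }
            if (has_target) {
                if (target < 0 || target >= (long)size) return -1;
                work[n++] = (size_t)target;
            }
            if (op == OP_BRA || op == OP_JMP_EXT || op == OP_RTS) break;
            pc += len;
            if (pc >= size) return -1;                /* ran off the end */
        }
    }
    return 0;
}

/* Traces a toy image and returns how many bytes were never reached as code. */
int demo_data_bytes(void) {
    static const uint8_t img[] = {
        0x86, 0x05,        /* $00: LDA #$05 */
        0x27, 0x03,        /* $02: BEQ $07  */
        0xBD, 0x00, 0x0A,  /* $04: JSR $000A */
        0x39,              /* $07: RTS */
        0x48, 0x49,        /* $08: ASCII "HI"; a linear sweep would decode it */
        0x12,              /* $0A: NOP */
        0x39               /* $0B: RTS */
    };
    uint8_t is_code[sizeof img];
    int data = 0;
    size_t i;
    if (trace_code(img, sizeof img, 0, is_code) != 0) return -1;
    for (i = 0; i < sizeof img; i++) data += !is_code[i];
    return data;
}
```

The assumption that every JSR returns to the following byte is exactly what inline-parameter tricks like the one shown earlier in the thread break, which is part of why manual work remains.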

Beyond that, a lot of manual work is involved...

MagerValp

2006-11-05 08:51:47 UTC

am> I need a converter/translator that can translate my 6809 assembly to C
am> code. I tried with disassemblies, but they were of no use. Any help in
am> this regard would be appreciated. An affordable commercial tool, if one
am> is already available, would also be fine.

MM> What you ask is all but impossible. It's difficult enough to
MM> recover functionality by manually disassembling code, let alone
MM> writing a program to do it automatically.

One option is to keep the original binary code, and run it with a
custom 6809 emulator. You can replace sections of 6809 code with
native C code, starting with the I/O routines. I have successfully
ported 6502 code this way, and it runs pretty well, all things
considered.
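
A very reduced sketch of that hybrid scheme is below. It assumes an interpreter that knows only three opcodes (LDA immediate, JSR extended, RTS), a hypothetical character-output routine at $F000 that has been replaced by a C function, and made-up load and stack addresses. A real port would use a complete 6809 core.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { STACK_TOP = 0x8000 };                   /* hypothetical initial S */

typedef struct { uint8_t a; uint16_t pc, s; } cpu6809;  /* illustrative subset */
typedef void (*native_fn)(cpu6809 *);
typedef struct { uint16_t addr; native_fn fn; } hook;

static uint8_t mem[0x10000];
static char out_buf[16];
static size_t out_len;

static void push16(cpu6809 *c, uint16_t v) {
    mem[--c->s] = (uint8_t)v;                  /* low byte first */
    mem[--c->s] = (uint8_t)(v >> 8);
}
static uint16_t pull16(cpu6809 *c) {
    uint16_t hi = mem[c->s++];
    uint16_t lo = mem[c->s++];
    return (uint16_t)((hi << 8) | lo);
}

/* Native replacement for the hypothetical output routine at $F000. */
static void native_putc(cpu6809 *c) {
    assert(out_len < sizeof out_buf);
    out_buf[out_len++] = (char)c->a;
}

/* Interprets until the top-level RTS. Returns 0 on a clean exit, -1 on an
   unsupported opcode or when max_steps runs out. */
int run(cpu6809 *c, const hook *hooks, size_t nhooks, int max_steps) {
    while (max_steps-- > 0) {
        size_t i;
        uint8_t op;
        for (i = 0; i < nhooks; i++) {
            if (hooks[i].addr == c->pc) break;
        }
        if (i < nhooks) {                      /* hooked: run C, then act like RTS */
            hooks[i].fn(c);
            c->pc = pull16(c);
            continue;
        }
        op = mem[c->pc++];
        switch (op) {
        case 0x86:                             /* LDA #imm */
            c->a = mem[c->pc++];
            break;
        case 0xBD: {                           /* JSR extended */
            uint16_t target =
                (uint16_t)((mem[c->pc] << 8) | mem[(uint16_t)(c->pc + 1)]);
            c->pc = (uint16_t)(c->pc + 2);
            push16(c, c->pc);
            c->pc = target;
            break;
        }
        case 0x39:                             /* RTS */
            if (c->s == STACK_TOP) return 0;   /* nothing left to return to */
            c->pc = pull16(c);
            break;
        default:
            return -1;                         /* not covered by this sketch */
        }
    }
    return -1;
}

/* Loads LDA #$41 / JSR $F000 / RTS at hypothetical $1000 and runs it. */
int demo_hooked_run(void) {
    static const uint8_t prog[] = { 0x86, 0x41, 0xBD, 0xF0, 0x00, 0x39 };
    static const hook hooks[] = { { 0xF000, native_putc } };
    cpu6809 c = { 0, 0x1000, STACK_TOP };
    size_t i;
    out_len = 0;
    for (i = 0; i < sizeof prog; i++) mem[0x1000 + i] = prog[i];
    if (run(&c, hooks, 1, 100) != 0 || out_len != 1) return -1;
    return out_buf[0];
}
```

The original binary never needs 6809 code at $F000. More routines can be moved into the hook table one at a time while the rest keeps running under emulation.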

--
___ . . . . . + . . o
_|___|_ + . + . + . Per Olofsson, arkadspelare
o-o . . . o + ***@cling.gu.se
- + + . http://www.cling.gu.se/~cl3polof/

Mark McDougall

2006-11-05 23:57:03 UTC

Post by MagerValp
One option is to keep the original binary code, and run it with a
custom 6809 emulator. You can replace sections of 6809 code with
native C code, starting with the I/O routines. I have successfully
ported 6502 code this way, and it runs pretty well, all things
considered.

The OP didn't specify why they needed the conversion or what they
intended to do with the output. I suspect it was either to modify
the behaviour on the current hardware or to port it to another
similarly-powered platform.

If the intention is the former, then I'm afraid they're SOL.

If it is the latter, then emulation may indeed be an option - from pure
software emulation on a more powerful platform to emulation of the 6809
within an FPGA.

Regards,

--
Mark McDougall, Engineer
Virtual Logic Pty Ltd, <http://www.vl.com.au>
21-25 King St, Rockdale, 2216
Ph: +612-9599-3255 Fax: +612-9599-3266

r***@gmail.com

2017-05-23 21:22:08 UTC

I was just on the web site of, I think, a CA-based company that sells a 6809 asm-to-C converter for $875, or will translate a million lines for $5,000. Sorry, I can't find it again now - too much security in the browser I was using!

g***@gtoal.com

2020-04-15 16:47:55 UTC

I know this is an ancient thread, but if this is still of interest to anyone, have a look at http://gtoal.com/vectrex/6809sbt/6809.c.html and associated files in that directory. May help.